INTRODUCTION


1.  NATURE  OF  THE  PROBLEM 

The  often  fatal  condition  of  botulism  is  caused  by  a  group  of  highly  toxic  proteins 
(botulinum  neurotoxin,  BoNT)  produced  by  certain  species  of  Clostridia,  principally 
Clostridium  botulinum  (Sugiyama,  1980).  On  the  basis  of  their  serological  properties,  seven 
distinct  types  of  BoNT  are  recognised,  and  have  been  designated  BoNT/A  to  G.  They  exert 
their  effects  on  vertebrates  by  blocking  the  release  of  the  neurotransmitter  acetylcholine  in 
presynaptic  nerve  termini,  resulting  in  neuromuscular  paralysis  (Habermann  and  Dreyer,  1986; 
Simpson, 1989). Although BoNT is synthesised as a single polypeptide chain (Mr
approximately 150 000), proteolytic cleavage generates the more toxic dichain form, in which
a 50 000 Da polypeptide light (L) chain and a 100 000 Da heavy (H) chain are linked by a
disulphide bridge. The different types of Clostridium botulinum exhibit differential
efficiencies  in  nicking  of  the  single  chain  to  the  dichain  form.  Thus,  BoNT/A  exists 
principally  as  a  dichain,  BoNT/B  exists  as  a  mixture  of  predominantly  single  chain  with  some 
dichain,  whereas  BoNT/E  is  found  essentially  only  in  the  single  chain  form  (Dasgupta,  1990). 
Purified  single  chain  toxin  may  be  converted  to  the  dichain  form  in  vitro  by  proteolytic 
cleavage  with  trypsin  (Dolly  et  al.,  1984). 

The overall structure and mode of action of BoNT are shared by a second clostridial toxin,
namely tetanus toxin (TeTx) of Clostridium tetani (Wellhoner, 1982). The two differ in that, whereas
BoNT acts at the nerve periphery, TeTx blocks the release of inhibitory amino acids in the
central nervous system. The neuroparalytic action of both types of neurotoxin has been
suggested (Simpson, 1986) to be composed of three distinct phases: (i) binding of the toxin to
neurone acceptor sites; (ii) an energy-dependent internalisation stage in which the toxin, or part
of it, enters the nerve cell; and (iii) the eventual blockade of neurotransmitter release.
Although the exact mechanisms involved remain poorly understood, it is generally assumed
that the L chain possesses the pharmacological activity (Bittner et al., 1989; Ahnert-Hilger et
al., 1989) and the H chain is responsible for binding of the dichain to cell surface acceptors
and thereafter internalisation through the cell membrane (Simpson, 1989). Some evidence has
been obtained suggesting that the channel-forming activity resides in the NH2-terminal portion
of the H chain (Mochida et al., 1989; Poulain et al., 1990) and acceptor recognition sites in the
COOH-terminus (Morris et al., 1981; Shone et al., 1985; Kozaki et al., 1987; 1989).

The effectiveness of modern food-preserving processes in Western countries has made
outbreaks of botulism extremely rare. The frequent use of C. botulinum as a test organism in
the food industry, and the growing use of the toxin by neurobiochemists, has, however, led to
the development of human vaccines. The formulation of these vaccines has changed little since
the early 1950s: partially purified preparations of the neurotoxins are toxoided by
formaldehyde treatment and adsorbed onto precipitated aluminium salts. Using such
methodology, polyvalent vaccines (against ABCDE or ABEF) for human immunisation are
currently available. Such vaccines suffer from the drawbacks of a low immune response and
considerable batch-to-batch variation, due to the high proportion (60-90%) of contaminating
proteins in toxoid preparations. Recent work has therefore concentrated on the development of
procedures for the purification of toxins to near-homogeneity. This has been achieved with all
but type G toxin (Shone et al., 1985; Evans et al., 1987; Schmitt et al., 1986). The use of
purified toxins in the production of vaccines, however, has its own drawbacks: the toxins must
be produced under high containment, and low levels of formaldehyde must be present
to prevent possible reversion of the toxoid to the active state.


2.  BACKGROUND  OF  PREVIOUS  WORK 

Production of subunit vaccines has been investigated by a number of laboratories. In
general, individual toxin subunits produce poor immune responses. A non-toxic fragment
comprising the L-chain and the N-terminal portion of the H-chain (analogous to the AB
fragment of tetanus toxin) of type A toxin has been shown to produce an immune response in
guinea pigs comparable to that of the entire toxin (Shone and Hambleton, 1989). It has therefore
been argued that such a toxoid polypeptide, produced by recombinant means, is an
excellent candidate for future vaccines. This would most simply be achieved by insertion of
the appropriate coding sequences into specialised bacterial vectors, which then direct the
expression of high levels of the protein in suitable bacterial hosts. The unparalleled
sophistication of recombinant procedures and vectors of E. coli has resulted in this
enterobacterium being the organism of choice in such processes. There are, however, a number
of factors which suggest that E. coli is not the best candidate for undertaking the expression of
clostridial toxin genes.

Although clostridial genes are reported to express moderately well in E. coli (reviewed by
Young et al., 1989), this finding only applies to genes isolated from mesophiles encoding
proteins substantially smaller (c. 30 000-40 000 Da) than BoNT, or thermophilic genes (eg., from
C. thermocellum) whose G + C content closely matches that of E. coli. Attempts to express
clostridial genes encoding large polypeptides have met with either very limited success (eg.,
type A toxin of Clostridium difficile; von Eichel-Streiber, 1989) or total failure (eg., the
bacteriocin of the Clostridium perfringens plasmid pIP404; Garnier and Cole, 1988). More
germane to BoNT have been the attempts to obtain expression of polypeptide fragments of
TeTx. In the study of Eisel et al. (1986) various subfragments of the gene were expressed in
E. coli, either initiating from the tetanus ATG or as fusion proteins. The levels attained were
extremely poor, and it was concluded that no clone "led to the synthesis of sufficient amounts
of toxin-specific protein to allow biological studies. At present these considerations argue
against a large-scale production of toxoid based on genetically engineered non-toxic
derivative." Similar results were obtained by Fairweather et al. (1986, 1987), who expressed
the C-terminal portion of the toxin (43% of the molecule) to levels less than 1% of the cell's
soluble protein. More recently, attempts to express subfragments encoding either the L-chain
or substantial portions of the H-chain of the type A gene have met with little success (A.H.
Bingham, personal communication). A further difficulty encountered in all these studies was
considerable degradation of the polypeptides produced, even in protease-minus E. coli hosts.

The  reasons  for  the  observed  inefficient  expression  of  large  clostridial  toxin  genes  would 
appear complex, but the apparent translational barrier is suggested (Eisel et al., 1986; Cole and
Garnier, 1988) to be a consequence of the extremely biased codon usage exhibited by
clostridial genes. Thus genes isolated from Clostridium spp. whose genomic DNA is of a
high A+T content (greater than 70% A+T) exhibit an extremely strong discrimination
against all degenerate codons ending in C or G, or, in the case of Leu and Arg, beginning with
C. In the case of the neurotoxin type A gene (Thompson et al., 1990), 86.1% of Arg codons
conform to AGN rather than CGN, 69% of Leu codons conform to UUA as opposed to CUN,
while overall, 90.3% of the degenerate codons end in A or U. In the tetanus toxin gene the
equivalent respective figures are 92.1%, 69.3% and 92.9%. A consequence of this codon bias
is that many of those codons known to act as modulators of gene expression in E. coli
(Grosjean and Fiers, 1982) occur extremely frequently in clostridial genes, eg., the type A
neurotoxin gene exhibits a 53.8% preference for AUA (Ile), 43.7% preference for GGA (Gly)
and an overall 86.1% preference for AGN (Arg) modulator codons. It would appear that
although  E.  coli  can  tolerate  a  certain  number  of  such  codons,  as  occurs  in  genes  of  moderate 
size,  the  cumulative  effect  of  the  sheer  volume  of  modulator  codons  present  in  clostridial 
neurotoxin  genes  results  in  a  dramatic  reduction  in  translational  efficiency.  The  most  logical 
solution  to  these  problems  would  be  to  use  a  clostridial  host,  rather  than  E.coli. 
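
The codon-bias figures quoted above (share of degenerate codons ending in A or U, share of Arg codons of the AGN type, share of Leu codons that are UUA) can be recomputed from any in-frame coding sequence. The following is a minimal sketch only: the function names are hypothetical, the toy sequence is not taken from any toxin gene, and "degenerate" is assumed to mean every sense codon whose amino acid has more than one codon (so Met and Trp are excluded). The counting rules actually used for the published figures are not stated in this report.

```python
from collections import Counter

# Standard genetic code in RNA letters; '*' marks stop codons.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Number of synonymous codons per amino acid (1 for Met and Trp).
FAMILY_SIZE = Counter(CODON_TABLE.values())


def split_codons(seq):
    """Split an in-frame coding sequence (DNA or RNA) into RNA codons."""
    rna = seq.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable, 3)]


def _fraction(hits, total):
    """Return hits/total, or None when there is nothing to count."""
    return hits / total if total else None


def codon_bias(seq):
    """Return the three bias measures quoted in the text, as fractions."""
    codons = [c for c in split_codons(seq) if c in CODON_TABLE]
    degenerate = [c for c in codons
                  if CODON_TABLE[c] != "*" and FAMILY_SIZE[CODON_TABLE[c]] > 1]
    arg = [c for c in codons if CODON_TABLE[c] == "R"]
    leu = [c for c in codons if CODON_TABLE[c] == "L"]
    return {
        "au_ending": _fraction(sum(c[2] in "AU" for c in degenerate), len(degenerate)),
        "arg_agn": _fraction(sum(c.startswith("AG") for c in arg), len(arg)),
        "leu_uua": _fraction(sum(c == "UUA" for c in leu), len(leu)),
    }


# Toy in-frame sequence (hypothetical, not from any toxin gene).
print(codon_bias("AGAAGACGTTTATTACTGATAGGC"))
```

Applied to a complete toxin gene, the three fractions could be compared directly with the percentages given above, bearing in mind the assumptions stated in the lead-in.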

3.  PURPOSE  OF  THE  PRESENT  WORK 

The  production  of  a  polyvalent  vaccine  against  all  known  types  of  botulinum  neurotoxins 
requires the availability of large quantities of pure protein which is: (i) capable of eliciting the
production of neutralising antibody in humans; and (ii) non-toxic to personnel involved in its
isolation, purification and formulation into a vaccine. These criteria cannot currently be met
by producing authentic neurotoxin from natural clostridial strains. Although the desired
subunit vaccine could conceivably be produced by recombinant means, as discussed above,
translational barriers suggest that E. coli cannot be employed as the recombinant host. A
major  objective  of  this  study  was  therefore  to  develop  a  clostridial  expression  system,  ideally 
based  on  a  non-toxinogenic  host  closely  related  to  C.botulinum,  and  test  its  utility  by 
expressing  various  non-toxic  polypeptides  (principally  derived  from  the  H-chain  moiety)  of  the 
type  A  neurotoxin.  The  immunogenicity  of  these  recombinant  polypeptides  would  then  be 
evaluated  as  potential  subunit  vaccines.  In  parallel,  the  second  principal  objective  has  been  to 
clone  other  neurotoxin  genes  (types  B,  E,  F  and  G)  and  derive  their  complete  primary  amino 
acid  sequences  by  nucleotide  sequence  analysis.  Selected  polypeptides  of  these  neurotoxins 
could  then  also  be  produced  using  the  recombinant  host/vector  system  and  their  potential  as 
subunit vaccines ascertained. At the end of these studies it was anticipated that a system for
producing high levels of non-toxic neurotoxin polypeptides would have been developed
which might be used in the formulation of a general botulism vaccine against types A, B, E, F
and  G.  Furthermore,  the  availability  of  the  complete  primary  amino  acid  sequences  of  these 
toxins  will  facilitate  future  work  which  may  be  aimed  at  deriving  vaccines  based  on  synthetic 
peptides. 


4.  METHODS  OF  APPROACH 

4.1  Development  of  a  Clostridial  Expression  System 

Our initial strategy was to choose a Clostridium sp. taxonomically closely related to C.
botulinum, and formulate procedures for introducing recombinant DNA. Our choice of
host was Clostridium sporogenes (taxonomically considered to be a non-toxigenic species of
C. botulinum; Cato and Stackebrandt, 1989), and the DNA transfer procedures investigated were
electroporation and conjugative transfer. The former procedure, ubiquitous in its application
to Gram-positive bacteria (see Chassy et al., 1988; Luchansky et al., 1988), relies on the
transient introduction of pores into the cell membrane, by applying an electrical discharge
across cell suspensions, through which exogenous DNA may pass. We have previously used
electroporation for the successful introduction of plasmids into C. acetobutylicum (Oultram et
al., 1988a), and similar protocols have been published for C. perfringens (eg., Allen and
Blaschek, 1988). Conjugative transfer relies on mobilisation of the cloning vector into C.
sporogenes by intergeneric matings (Trieu-Cuot et al., 1987). We have constructed a plasmid,
pMTL30 (Williams et al., 1989), which carries the ColE1 replicon, the Gram-positive
erythromycin resistance (EmR) gene of pAMβ1 (Brehm et al., 1987), the E. coli
lacZ'/multiple cloning region of pMTL20 (Chambers et al., 1988), and the oriT region of
plasmid RK2. Plasmid derivatives, in which the replication origins of either pCB101
(a Clostridium butyricum plasmid; Minton and Morris, 1981) or the streptococcal plasmid
pAMβ1 have been inserted, have been shown to be mobilised from an E. coli donor to a C.
acetobutylicum recipient at frequencies of up to 10'* per donor (Williams et al., 1989a; 1989b).
In our attempts to transfer DNA into various strains of C. sporogenes the plasmid vehicles
utilised were endowed with the replicative origins of either pCB101 or pAMβ1. The latter
type of vector was preferred as it has proven to possess an extremely broad host range amongst
Gram-positive bacteria, and exhibits a high degree of structural stability (Bruand et al., 1990;
Swinfield et al., 1990). As it cannot be assumed that these replicons will function in C.
sporogenes it was envisaged that replicons could be cloned from indigenous C. sporogenes
cryptic plasmids into 3 different types of "in-house" Gram-positive replicon cloning vector
(ie., plasmids only capable of replicating in E. coli). These vectors (pMTL20E, pMTL20C and
pMTL20T) carry three different Gram-positive resistance genes (erm, cat and tetP,
respectively), all of which have been shown to express in Clostridium spp. (see Minton and
Oultram, 1988; Abraham and Rood, 1985).

Having formulated procedures for DNA transfer we proposed to endow constructed shuttle
vectors with efficient transcription/translation signals to facilitate high expression of
appropriately inserted heterologous genes. Since ribosomal RNA (rRNA) operons are generally
transcribed efficiently it was proposed that the rRNA genes of C. sporogenes would be the
source of transcriptional initiation and termination signals. Once cloned and characterised, the
identified promoter region was to be modified by advanced genetic engineering (ie., creation of
restriction sites by site-directed mutagenesis and insertion of required sequences as synthesised
"units") to create an expression cartridge. This would consist of a portable restriction
fragment, carrying (in sequential order): the rRNA promoter -35 and -10 elements; a synthetic
E. coli lacZ operator sequence positioned immediately following the rRNA -10; a synthetic
ribosome binding site (SD) complementary to the determined C. sporogenes 16S rRNA; at an
appropriate distance from the SD, a recognition sequence for NdeI (CATATG), followed by
the lacZ'/multiple cloning sites of plasmid pMTL20, whereby the ATG represents the
translational initiation codon of lacZ'; finally, the lacZ' region would be followed by the
transcriptional termination signals of the rRNA operon. The efficiency of the system could be
tested using a suitable promoter-less reporter gene, eg., cat. The presence of the lacZ operator
site should allow repression of expression during construction in E. coli (by the presence of the
high copy number lacIq plasmid pNM52; Gilbert et al., 1986), and thereafter regulated
expression of the gene in clostridia. It was envisaged that this could be achieved in an
analogous fashion to that used in B. subtilis (Le Grice et al., 1987), where a plasmid-borne
copy of the lacI gene is placed under the transcriptional control of a moderate clostridial
promoter (we will use the Clostridium pasteurianum leuB promoter, cloned and sequenced in
this laboratory), and induction of the rRNA expression cartridge elicited by addition of IPTG.
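
The order of elements in the proposed expression cartridge, and the way the NdeI site (CATATG) supplies the initiation codon, can be made explicit with a small data-structure sketch. The class and function names below are hypothetical and the three-codon ORF is a placeholder. No real promoter, operator or ribosome binding site sequences are given, because none appear in this report.

```python
from dataclasses import dataclass

NDEI_SITE = "CATATG"  # NdeI recognition sequence quoted in the text


@dataclass(frozen=True)
class Element:
    name: str
    purpose: str


# Order follows the description in the text, promoter first.
CARTRIDGE = (
    Element("rRNA promoter (-35 and -10)", "transcription initiation"),
    Element("lac operator", "repression, relieved by IPTG"),
    Element("ribosome binding site (SD)", "ribosome recruitment"),
    Element("NdeI site", "supplies the ATG initiation codon"),
    Element("lacZ'/multiple cloning sites", "insertion of heterologous genes"),
    Element("rRNA terminator", "transcription termination"),
)


def fuse_at_ndei(orf):
    """Place an ORF so that its own ATG is the ATG of the NdeI site."""
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("ORF must begin with an ATG initiation codon")
    # The first three bases of the site (CAT) precede the ORF; the last
    # three (ATG) are the ORF's own start codon.
    return NDEI_SITE[:3] + orf


junction = fuse_at_ndei("ATGAAATAA")  # hypothetical three-codon ORF
print(" > ".join(e.name for e in CARTRIDGE))
print(junction)
```

Because the ATG lies inside the restriction site, any gene engineered to begin with an NdeI site can be inserted with its initiation codon at a fixed distance from the ribosome binding site.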

Our subsequent failure to elicit demonstrable DNA transfer to any of the strains of C.
sporogenes tested necessitated a substitution of the intended recombinant host with C.
acetobutylicum. This clostridial species has a number of advantages over C. sporogenes. On a
practical level, we have already developed the necessary means of manipulating this species.
Equally important, this species has no known association with human disease and should
therefore command a lower Access factor in any proposed recombinant experiments. The
proposed expression of BoNT gene subfragments can therefore be undertaken at a lower
category of containment. Furthermore, parallel studies undertaken in this laboratory have
resulted in the construction of an expression cartridge, similar to that described above, based
on the promoter of the ferredoxin (Fd) gene of Clostridium pasteurianum. This promoter,
modified by the insertion of the E. coli lac operator, has been designated the fac promoter and
shown to direct the expression of a cat gene in C. acetobutylicum NCIB 8052 to between 5 and
10% of the cells' soluble protein. Once C. sporogenes was abandoned as the recombinant
host, efforts were therefore switched to attempting to obtain botA expression in NCIB 8052.

4.2 Cloning of Botulinum Neurotoxin Genes

The  strategies  utilised  in  the  cloning  of  the  type  B,  E,  F  and  G  neurotoxin  genes  were 
devised  to  minimise  the  risk  of  obtaining  a  toxinogenic  E.  coli  recombinant  clone,  and 
mirrored  the  measures  taken  in  the  cloning  of  the  BoNT/A  gene,  botA  (Thompson  et  al., 
1990).  Thus,  as  both  L  and  H  chain  are  required  for  toxicity  (Simpson,  1989),  only  DNA 
fragments  encoding  principally  one  component  of  the  dichain  were  cloned.  Where  genomic 
fragments were cloned, their coding potential was determined by the construction of genomic
maps  using  botA  DNA  probes  in  Southern  blots.  Furthermore,  they  were  always  isolated  by 
two-stage  agarose  gel  size  fractionation  to  minimise  the  risk  of  cloning  contiguous  DNA 
fragments.  As  more  nucleotide  sequence  information  became  available,  specific  regions  were 
amplified  for  cloning  by  polymerase  chain  reaction  (PCR). 
1. CLONING OF THE BoNT GENES

1.1  MATERIALS  AND  METHODS 
Bacterial  strains,  plasmids  and  culture  conditions. 

The sources of chromosomal DNA were C. botulinum strain B/Danish, the type E strain
NCTC 11219, the type F Langeland strain and the type G strain 89G. The recombinant host
used for cloning experiments was E. coli TG1 (Δ[lac-pro] supE thi hsdD5 F'[traD36 proA+B+
lacIq lacZΔM15]). Cloning vectors employed were plasmids pMTL32 (this study), pMTL20
(Chambers et al., 1988), pCR1000 (Mead et al., 1991), and the M13 phages mp18 and mp19
(Yanisch-Perron et al., 1985). C. botulinum was cultivated in USA II broth (2% peptone, 1% yeast
extract, 1% N-Z amine, 0.05% sodium mercaptoacetate, 1% glucose, pH 7.4), and E. coli in
L-broth (1% tryptone, 0.5% yeast extract, 0.5% NaCl). Solidified medium (L-agar) consisted
of L-broth with the addition of 2% (w/v) agar (Bacto, Difco). Antibiotic concentrations used
for the maintenance and the selection of transformants were 50 µg/ml ampicillin (pMTL32/
pMTL20) and 50 µg/ml kanamycin (pCR1000). Restriction endonucleases and DNA
modifying enzymes were purchased from Northumbria Biochemicals Ltd, Taq polymerase
from United States Biochemical Corporation and radiolabel from Amersham International.


Purification  and  manipulation  of  DNA. 

Transformation of E. coli and large-scale plasmid isolation procedures were as previously
described (Minton et al., 1983). Small-scale plasmid isolation was by the method of Holmes
and Quigley (1981), while chromosomal DNA from C. botulinum was prepared essentially as
described by Marmur (1961). Restriction endonucleases and DNA modifying enzymes were
used under the conditions recommended by the supplier. Digests were electrophoresed in 1%
agarose slab gels on a standard horizontal system (BRL Model H4), employing
Tris-borate-EDTA (0.09 M Tris borate, 0 M EDTA) buffer. Fragments were isolated from gels using
electroelution (McDonnell et al., 1977). All primary cloning procedures were undertaken
under United Kingdom ACGM C2 containment conditions, and total cell lysates of all
recombinants carrying cloned material were tested in mice for the absence of toxic
polypeptides.


DNA/DNA hybridisation experiments.

DNA restriction fragments were transferred from agarose gels to "zeta probe" nylon
membrane using the procedure of Reed and Mann (1985). After partial depurination with 0.25
M HCl (15 min), DNA was transferred in 0.4 M NaOH by capillary elution for between 4
and 16 hours. Bacterial colonies were screened for desired recombinant plasmids by in situ
colony hybridisation (Grunstein and Hogness, 1975), using nitrocellulose filter disks
(Schleicher and Schuell, 0.22 µm). The gel-purified botA DNA fragments were labelled with
dATP  using  a  multiprime  kit  supplied  by  Amersham  International.  Hybridisations  were 
carried  out  as  previously  described  (Thompson  et  al.,  1990),  at  temperatures  ranging  from  45 
to  60  °C. 


Nucleotide  sequence  of  bot  plasmid  inserts. 

The nucleotide sequences of plasmid inserts were determined by a number of different
strategies. In some instances the entire insert was excised, circularised by treatment with T4
ligase, and size-fractionated 500-1000 bp fragments generated by sonication were cloned
into the SmaI site of M13mp18 (for experimental conditions, see Minton et al., 1986).
Approximately 50 templates were then sequenced by the dideoxynucleotide method of Sanger et
al. (1980) using a modified version of bacteriophage T7 DNA polymerase, "Sequenase" (Tabor
and Richardson, 1987). Experimental conditions used were as stated by the supplier (United
States Biochemical Corp.). The inserts of other plasmids (eg., pCBB2 and pCBB3) were
sequenced using templates derived by subcloning the entire region between the appropriate
sites of M13mp18 and M13mp19. Sequence data obtained employing universal primer were
then sequentially extended by the use of custom-synthesised oligonucleotide primers. In
certain instances, templates were generated by the insertion of DraI restriction subfragments
into the SmaI site of M13mp18. In all cases the sequence was determined on both DNA
strands. On some occasions PCR-amplified DNA was cloned directly into either pCR1000 or
ddT-tailed, SmaI-cut M13mp8 (prepared by incubating SmaI-cut DNA with terminal
transferase in the presence of dideoxy TTP), and the resultant plasmid/template sequenced
with universal primer. DNA sequence data were analysed using the computer software of
DNASTAR Inc.
Amplification of DNA by PCR.


Amplification of C. botulinum DNA was undertaken by polymerase chain reaction (PCR),
using an MJ Research Inc. thermal cycler. Reaction mixtures comprised 10 mM Tris-HCl,
50 mM KCl, 3 mM MgCl2, 0.1 mM dNTP, 30 nmol of each primer, 2.5 units of Taq
polymerase, and 10 ng of C. botulinum genomic DNA, in a final volume of 0.1 ml.
Amplification was for 30 cycles, as follows: 1.5 min at 93°C; 3 min at 37°C; and 3 min at
72°C. For inverse PCR, 140 ng of chromosomal DNA, cleaved with an appropriate restriction
endonuclease, was ligated overnight at 14°C in a 50 µl volume and a 10 µl portion of the
resultant concatenated DNA used in PCR.
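
The cycling programme described above can be written down as a small data structure, which makes the total programmed hold time easy to check. This is a sketch only: the step labels (denature, anneal, extend) are the conventional interpretation of the three temperatures and are not stated in the text, the names are hypothetical, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str      # conventional name for the step (an assumption)
    temp_c: float   # block temperature in degrees Celsius
    minutes: float  # hold time at that temperature


# One cycle as given in the text: 1.5 min at 93, 3 min at 37, 3 min at 72.
CYCLE = (
    Step("denature", 93, 1.5),
    Step("anneal", 37, 3),
    Step("extend", 72, 3),
)
N_CYCLES = 30


def total_hold_minutes(cycle=CYCLE, n_cycles=N_CYCLES):
    """Total programmed hold time, excluding ramping between steps."""
    return n_cycles * sum(step.minutes for step in cycle)


print(total_hold_minutes())  # 30 cycles x 7.5 min per cycle
```

At 7.5 min of hold time per cycle, the 30-cycle programme amounts to 225 min before ramping is taken into account.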


1.2 CLONING/SEQUENCING OF THE BoNT/E GENE


1.2.1 Summary

The entire structural gene for the Clostridium botulinum NCTC 11219 type E neurotoxin
has been cloned as 5 overlapping DNA fragments, generated by PCR. Analysis of
triplicate clones of each fragment, derived from 3 independent PCRs, has allowed the
derivation of the entire nucleotide sequence of the BoNT/E gene. Translation of the sequence
has shown BoNT/E to consist of 1252 amino acids, and as such it represents the smallest BoNT
characterised to date. The L chain of the toxin exhibits the highest level of sequence similarity
to TeTx (40%). The L chains of BoNT/A and BoNT/D share 33% similarity with BoNT/E,
while BoNT/C exhibits 32% similarity. In contrast, the TeTx H chain exhibits the lowest
degree of homology (35%) with BoNT/E, with the BoNT H chains sharing 46%, 36% and
37%, for the type A, C and D neurotoxin types, respectively. Comparisons with partial amino
acid sequences of the L chain of BoNT/E from C. botulinum strain Beluga and that from the
strains Mashike, Iwanai and Otaru indicate single amino acid differences in each case.
Alignment of all characterised neurotoxin sequences (BoNT/A, BoNT/C, BoNT/D, BoNT/E
and TeTx) shows them to be composed of highly conserved amino acid domains interspersed
with amino acid tracts exhibiting little overall similarity. The most divergent region
corresponds to the extreme COOH-terminus of each toxin, which may reflect differences in
specificity of binding to neurone acceptor sites.
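
Percentages of the kind quoted in this summary are typically derived from pairwise alignments. The sketch below shows one common definition, percent identity over aligned positions with gapped columns skipped. It is an assumption that a measure of this kind underlies the reported figures, since the scoring scheme is not stated here. The function name and the toy aligned strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Toy pre-aligned strings (hypothetical, not real toxin sequences).
print(round(percent_identity("HELIH-AL", "HELNHSAL"), 1))
```

Other conventions (counting gaps as mismatches, or scoring conservative substitutions as similar) give different values, so figures are only comparable when computed in the same way.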


1.2.2 Results and Discussion

Probing with type A neurotoxin gene subfragments


To identify specific restriction fragments encoding principally L or H chain we initially
sought to exploit DNA homology between the previously cloned BoNT/A gene (Thompson et
al., 1990) and the BoNT/E gene. Two restriction fragments were gel-purified from the
BoNT/A gene. The first, a 389 bp HpaI-XhoII fragment, encoded amino acids 216 through
346 of the BoNT/A L chain. The second, a 628 bp HaeIII-HindIII fragment, coded for amino
acids 526 through 736 of the H chain (Thompson et al., 1990). Both fragments were
radiolabelled and used in Southern blot experiments, employing type E genomic DNA cleaved
with various restriction enzymes. Under aqueous conditions, it was established that
hybridisation between the two genes occurred at 50°C in the case of the L chain probe, and at
53°C  in  the  case  of  the  H  chain  probe.  The  relatively  low  value  of  these  figures  was  indicative 
of  a  fairly  low  level  of  homology  between  the  genes  in  the  regions  probed,  and,  furthermore, 
suggested  that  homology  was  greater  in  the  H  chain  encoding  region. 

Further  experiments,  in  which  the  genomic  DNA  hybridised  had  been  cleaved  with  a 
combination  of  endonucleases,  allowed  the  derivation  of  crude  restriction  maps  of  the  regions 
of the type E genome homologous to the type A probes employed (data not shown). Inexplicably,
the two sets of results obtained could not be merged into a single unifying restriction
map.  This  anomaly  in  the  derived  data  meant  that  the  coding  potential  of  any  particular 
fragment,  with  regard  to  the  BoNT/E  gene,  could  not  be  confidently  assigned.  A  different 
route  to  cloning  was  therefore  adopted. 


Cloning  of  the  L  chain  encoding  region  by  PCR 

By reference to published amino acid sequences of the NH2-termini of the BoNT/E H
and L chains (Sathyamoorthy and Dasgupta, 1985; Schmidt et al., 1985), two oligonucleotides
were synthesised (primers LE1 and HE1, Table 1) which would allow amplification of
essentially the entire L chain encoding region by polymerase chain reaction (PCR). The
nucleotides in positions of codon degeneracy were chosen on the basis of those most commonly
found in clostridial genes (Young et al., 1989). PCR was undertaken with LE1 and HE1 and
type E chromosomal DNA, at various temperatures, in buffer containing Mg2+ at final
concentrations of either 1.5, 2.2 or 3.0 mM. Agarose gel electrophoresis of the reaction
products indicated that no specific DNA fragment had been generated. Previous comparative
alignment of the BoNT/A and TeTx L chains (Thompson et al., 1990) had indicated that very
few amino acids were absolutely conserved. One notable exception was a centrally located
histidine-rich motif. Indeed a preliminary amino acid sequence of part of the BoNT/E L chain
confirmed that this motif was also present in BoNT/E (Wernars and Notermans, 1990). Two
Table 1. Synthesised oligonucleotide primers employed in PCR amplification of Clostridium
botulinum NCTC 11219 genomic DNA.

OLIGO  ABILITY   NUCLEOTIDE SEQUENCE *                     NUCLEOTIDE      AMINO ACID *=         REFERENCE
       TO PRIME                                            POSITION **     POSITION
                                                           IN BoNT/E GENE

LE1    NO        ZN8PNYK0N
                 5'-ATAAATAGTTTTXATT ATAAAGATCC-3'         337-263         BoNT/E, 4-13          Sathyamoorthy et al. (1985)
                 T  T

LE2    YES       ItELZHSLHG
                 5'-CACGAACTTATACATTCTCrACATCG-3'          i61-i66         BoNT/E, 312-230       Wernars & Notermans (1990)
                 T  T  A  AT

LE2'   YES       HELIHStnC
                 3'-OTCCTTOAATATGTAAGAGATGTACC-5'          SS6-I61         BoNT/E, 313-330
                 A  A  T  TA

LE3    YES       fNYNOPVNO
                 5'-tttaat: aTAATG ATCCTCT AAATCA-3'       346-271         BoNT/E, 7-15          Sathyamoorthy et al. (1985)
                 T

HE1    YES       ZCIEIHNGE
                 3'-TATACATATCTTTATTTArrACCTCT-5'          1325-lSOl       BoNT/t^4 3-11         "
                 G  A

HE2    YES       PYIGPALNI
                 5'-CCAT ATATACGACCAGCATTAAATAT-3'         2037-2062       BoNT/A, 635-643       Thompson et al. (1990)
                 T  TT  T

HE3    YES       XRNBXNDEV
                 5'-AAAAGAAATGAAAAATCCGATCAAGT-3'          2344-3369       BoNT/A, 701-709       "
                 G  G  C  A

HE4    YES       NRAMZNINKP
                 5'-AATAAACCAATGAT AAATATAAATAAATT-3'      3461-2509       BoNT/A, 776-667       "
                 TC  T  AT  C  C  GG

HE5    YES       NRNIfVTITN
                 3'-rrATCTA<XTATAAACATTG'rrArrOTrr-5'      3139-3217       BoNT/A, 1013-1031     "
                 TC  A  A  A

HE6    NO        CTKPLIKKY
                 3' ^CTTG nr r AAATATT A ri’lTTTI AT«5'    3597-3632       BoNT/A, 1157-1165     "
                 A  C  T  0  C  CA

HE7    NO        NIPIPVOOGN
                 3'-ACCCTTAAATATGCTCATCTACTTCCAACC-5'      3945-3974       BoNT/A, 1273-1391     "
                 TO  AAAT  TGAT

LE4    YES       caoTV loar
                 3'-<?TACCCAAACTTSTATTCCACASTA-3'          1367-1291       BoNT/E, J47-J55       this study

HE8    YES       ivssasTK
                 3'-TATCATACCTI AACCTACTCATTT-3'           3390-3303       BoNT/E, 665-692       this study

LE5    YES       TFOSQfSl
                 3'-TCAOCTCTATTACTr AAGGTATAAC-3'          563-606         BoNT/E, 119-136       this study

LE6    YES       UITSIRCT
                 5'-cr AATAACAAATATAAGAOCTAC-3'            945-967         BoNT/E, 340-247       this study

HE9    YES       aarsisrwva
                 3'-nTTAAAATCATAATCAAAOACCCATTC-5'         3965-3993       BoNT/E, 913-931       this study
                 A

HE10   YES       DHSsewav
                 3'-ATAATAArrcACGATGCAAAGTAT-3'            3061-3064       BoNT/E, 945-953       this study

*  the  oligonucleotide  primers  LE1-LE3  *nd  HE1-HE7  »re  'guessomers*,  designed  to  priroe/anneaJ  to  DNA  sequence 
encoding  the  amino  sequence  illustrated  above.  These  amino  sequences  were  either  derived  by  NH^-terminal 
sequencing  of  purified  BoNT/E  light-  and  heavy-chain  subunits,  or  from  the  BoNT/A  sequence,  previously 
determined  by  recombinant  means  (Thompson  et  al.,  1990).  Where  these  primers  differed  from  the  actual  DNA 
sequence  of  the  BoNT/E  gene  is  illu.stral«l  below  the  .sequence.  With  the  exception  of  LE9,  all  other  primers  are 
perfect  primers,  based  on  the  determined  BoNT/E  gene  sequence.  Primer  LE9  is  based  on  the  equivalent  region  of 
the  BoNT/F  gene,  which  differs  from  the  BoNT/E  gene  in  this  region  by  one  nucleotide  (unpublished  data). 

*  position  in  the  BoNT/E  gene  to  which  the  oligonucleotides  are  targeted.  Numbers  correspond  to  nucleotide 
positions  in  Fig. 2. 

^  niim>v*rinrt  ^4?  *1.-.  - .  ?«1...a-_a-J  .1. - *«. .  * 


further primers (LE2 and LE2', Table 1) were therefore synthesised corresponding to the sense
and anti-sense DNA strands capable of encoding the histidine-rich motif of the BoNT/E L
chain. Subsequent PCR, at an annealing temperature of 37°C, using the primer pairs LE1 +
LE2', and LE2 + HE1, resulted in an amplified DNA fragment of the expected size only in the
case of the latter pair. Furthermore, appreciable amounts of DNA were only generated at the
highest Mg2+ concentrations employed. These data suggested that the failure of the initial
PCR to amplify a specific DNA fragment was due to inefficient priming by LE1. An
alternative primer was therefore synthesised (LE3, Table 1), and used in combination with
HE1 in a further PCR assay. In this case a DNA fragment of the expected size, 1.3 kb, was
evident following subsequent agarose gel electrophoresis of the reaction products.
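
The design of a single "best guess" primer from a peptide can be sketched as a back-translation
in which each amino acid is given one preferred codon. The following is a minimal illustration
only: the codon preferences are simply those that appear in primer LE1 as read from Table 1
(the third base of the proline codon is an assumption, since the primer stops after CC), and
they are not a measured clostridial codon usage table.

```python
# Illustrative codon choices, taken from primer LE1 itself (Table 1).
# "CCA" for proline is an assumption: the primer ends after the first two bases.
PREFERRED_CODON = {
    "I": "ATA", "N": "AAT", "S": "AGT", "F": "TTT",
    "Y": "TAT", "K": "AAA", "D": "GAT", "P": "CCA",
}


def back_translate(peptide, length=None):
    """Return one 'best guess' coding sequence for a peptide.

    Each residue is replaced by its single preferred codon; the result is
    optionally truncated to the desired primer length.
    """
    dna = "".join(PREFERRED_CODON[aa] for aa in peptide)
    return dna if length is None else dna[:length]


# Peptide used for LE1, truncated to the 26 nt primer length
le1 = back_translate("INSFNYKDP", length=26)
print(le1)
```

Because only one codon is chosen per residue, every degenerate position is a potential
mismatch with the real gene, which is why such primers may fail to prime.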

The amplified products of the LE3 + HE1 reaction were blunt-ended with T4 polymerase
and cloned into the SmaI site of pMTL20. Restriction analysis of 6 resultant recombinant
plasmids indicated the presence of a common restriction fragment. Confirmation that the
amplified fragment encoded BoNT/E was obtained by plasmid sequencing a representative
plasmid recombinant (designated pCBE1) with both universal and reverse primers. Translation
of the derived DNA sequences resulted in an uninterrupted amino acid sequence, which in the
case of that derived using universal primer exhibited 100% identity with a preliminary BoNT/E
sequence (Wernars and Notermans, 1990), while the sequence derived using reverse primer
had substantial homology to the COOH-terminus of the BoNT/A L chain. Having established
that the amplified fragment encoded BoNT/E, the entire nucleotide sequence of the pCBE1
insert was determined, as described in MATERIALS AND METHODS.


Cloning  of  H  chain  encoding  DNA  by  PCR 

In parallel to the experiments described above, a number of oligonucleotides were
synthesised with a view to amplifying DNA regions of the neurotoxin gene encoding parts of
the H chain. In the absence of amino acid sequence data for the BoNT/E H chain, we
reasoned that amino acid motifs common to BoNT/A and TeTx may also be present in
BoNT/E. The synthesised oligonucleotides (Table 1) therefore corresponded to a sense or
anti-sense DNA strand capable of encoding amino acid motifs found in BoNT/A which were
highly conserved in TeTx (Thompson et al., 1990). Individual PCRs were undertaken with
all possible combinations of the sense and anti-sense oligonucleotides, under the conditions
successfully established with the L chain primers. The only pairs of primers found to generate
DNA fragments of the expected size were HE2 + HE5, HE3 + HE5, and HE4 + HE5. As
the fragment derived from the reaction involving HE2 + HE5 was the largest (c. 1.2 kb), this
particular DNA product was cloned, following blunt-ending, into the SmaI site of pMTL20.
Plasmid sequencing of the resultant recombinant, pCBE2, and translation of the nucleotide
sequence obtained established the presence of uninterrupted amino acid sequences exhibiting
significant homology to the BoNT/A H chain. Thereafter, the complete nucleotide sequence of
the insert of pCBE2 was determined (see MATERIALS AND METHODS).

Fig. 1. BoNT/E gene cloning strategy. The 5 PCR-amplified regions of the NCTC 11219
chromosome that were cloned in the recombinant plasmids pCBE1-5 are represented by open
boxes below the restriction map of the region of the genome encoding the BoNT/E gene. LE
and HE primer sequences are given in Table 1. The arrows indicate the direction of
DNA synthesis; solid arrows are perfect primers, open arrows guessomers. The vertical dotted
line identifies the boundaries of the concatenated restriction fragment employed as the substrate
for inverse PCR, using primer pairs LE5 + LE6 and HE9 + HE10.


Cloning  of  the  remainder  of  the  BoNT/E  gene 

To clone the intervening BoNT/E DNA between the inserts of pCBE1 and pCBE2, two
oligonucleotide primers (LE4 and HE8; Table 1) were synthesised based on the determined
nucleotide sequences of the pCBE1/2 inserts. The 1.03 kb product generated in a PCR using
these primers was cloned directly into the specialised cloning vector pCR1000, and the
nucleotide sequence of the insert of a representative clone, pCBE3, determined.


DNA fragments carrying the remaining 3' and 5' ends of the BoNT/E gene were generated
by inverse PCR. This strategy required the identification of restriction sites proximal and
distal to the gene. These sites were mapped by employing the radiolabelled PCR products
generated by LE2 + HE1 and HE2 + HE5 in Southern blot experiments with restricted type E
chromosomal DNA. The data obtained, together with information available from the
nucleotide sequences of the inserts of pCBE1 and pCBE2, were used to construct an accurate
restriction map of the region of the type E genome encoding the BoNT/E gene (Figure 1).
This indicated that the 5' end of the structural gene resided on a c. 1.0 kb EcoRI fragment, and
that the 3' end of the gene was encompassed by a 1.1 kb DdeI fragment. Accordingly, type E
chromosomal DNA was cleaved with the appropriate enzyme, self-ligated and a PCR
undertaken on the circularised products using the oligonucleotide primer pairs LE5 + LE6 in
the case of EcoRI-cleaved DNA, and HE9 + HE10 in the case of DNA cut with DdeI. In both
cases, DNA fragments of the calculated size were shown to be generated. Each amplified
DNA product was cloned directly into pCR1000 (Mead et al., 1991), yielding pCBE4 (3'-end)
and pCBE5 (5'-end), and the entire nucleotide sequences of their inserts determined
(MATERIALS AND METHODS).
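
The "calculated size" of an inverse PCR product follows from simple geometry: once the
restriction fragment is circularised, two primers that face outwards on the linear map
amplify everything except the stretch that lay between them. A minimal sketch of that
arithmetic is given below. The function name is hypothetical, and the numbers in the example
are approximations chosen for illustration (a fragment of about 1000 bp, as for the c. 1.0 kb
EcoRI fragment, with primer 5' ends at positions 606 and 945 as listed for LE5 and LE6 in
Table 1); they are not the sizes reported for the actual products.

```python
def inverse_pcr_product_length(fragment_len, antisense_5p, sense_5p):
    """Length of the product amplified from a self-ligated (circular) fragment.

    Coordinates are 1-based positions within the linear restriction fragment.
    antisense_5p: 5' end of the primer that extends towards position 1.
    sense_5p:     5' end of the primer that extends towards fragment_len.
    """
    if not 1 <= antisense_5p < sense_5p <= fragment_len:
        raise ValueError("primers must face outwards, antisense upstream of sense")
    upstream_part = antisense_5p                   # positions 1 .. antisense_5p
    downstream_part = fragment_len - sense_5p + 1  # positions sense_5p .. end
    # On the circle the two parts are joined through the ligation junction
    return upstream_part + downstream_part


# Illustrative values only (see lead-in): ~1000 bp fragment, primers at 606 and 945
product = inverse_pcr_product_length(1000, 606, 945)
print(product)
```

The region between the two primers (here positions 607 to 944) is absent from the product,
which is why the primers are placed close to the ends of the sequence already known.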


The  complete  nucleotide  sequence  of  the  BoNT/E  gene 

The 5 overlapping nucleotide sequences derived from the inserts of pCBE1 to pCBE5 in
total encompassed the entire BoNT/E structural gene. However, because Taq polymerase is
known to misincorporate nucleotides during DNA synthesis (Eckert and Kunkel, 1991), the
sequence obtained may not have represented the authentic BoNT/E sequence. Therefore, all 5
cloned DNA fragments were reamplified by PCR, and cloned to give duplicate isolates of the
five plasmids, pCBE1 to pCBE5. The nucleotide sequences of the entire inserts of each new
plasmid were determined and compared to those derived from the initial clones. In those cases
where a discrepancy in sequence was apparent, the appropriate fragment was PCR-amplified
and cloned to give a third pCBE clone. The sequence of the relevant region of the insert of
this plasmid was then determined, and the consensus of the 3 sequences taken as being the
correct BoNT/E gene sequence. The number of discrepancies in the three sequences was
surprisingly high, with a total of 7 PCR-induced substitutions and 2 single base additions.
Both of the latter occurred in regions of the sequence composed of at least 5 consecutive 'A'
nucleotides. This error rate equates to 7.8 × 10^-4 per nt (i.e., 9 errors per 11 500 bases) and
is most probably a direct result of the relatively high level of Mg2+ employed (Eckert and
Kunkel, 1991).
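
The two-out-of-three rule and the quoted error rate can be illustrated with a short sketch.
The reads below are toy sequences invented for the example (one clone carries a single
substitution); only the final figure of 9 errors in 11 500 bases comes from the text. The
sketch assumes the sequences have already been aligned, so single base additions would first
need to be represented with gap characters.

```python
from collections import Counter


def consensus(seqs):
    """Majority-vote consensus of equal-length, already aligned sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # For each alignment column keep the most frequent character
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def count_differences(seq, ref):
    """Number of positions at which seq differs from ref."""
    return sum(a != b for a, b in zip(seq, ref))


# Toy reads (hypothetical): the second clone carries one substitution
clones = ["ATGCCAAAAATT", "ATGCCTAAAATT", "ATGCCAAAAATT"]
ref = consensus(clones)
errors = [count_differences(c, ref) for c in clones]

# Error rate quoted in the text: 9 errors in about 11 500 sequenced bases
rate = 9 / 11500
print(ref, errors, f"{rate:.1e} per nt")
```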

The final sequence derived is illustrated in Fig. 2. The BoNT/E gene has a 75% A+T
content and is composed of 1253 codons, initiating at nucleotide position 228 with an AUG
codon and terminating at position 3986 with a UAA stop codon. The use of these particular
translational initiation and termination signals is a general characteristic of clostridial genes
EcoRI

hAATTCAACTAGTAGATAATAAAA>TAATCCACAGAUTTTATT*TTAATAATMTATATTTATCTCTAACTGTTTAACTTTAACTTArAACAATGTAAA  100 
9  tc 

-10  i  *1 

TGTATATTTGTCTATAAAAAATCAAGATTACAATTGGGTTATATGTGATCTTAATCATGATATACCAAAAMGTCATATCTATCGAIATTAAAAAATATA  200 

a 

_  MPKIMSFIIYMDPVM  IRTILTIKPG 

TAAATTTAAAATTAGGAGATGCTGTATATGCCAAAAATTAATAGTITTAATrATAATGATCCTGTTAATCATACAACAATnTATATATTAAACCAGGCG  300 


GCQEFYKSFNIMKIIIUIIPEIIIIVIGTTPODFHPP 
GTTGTCAAGAATTTTATAAATCATTTAATATTATGAAAAATATTTGGATAATTCCAGAGACAAATGTAATTGGTa'-JUCCCCCCAAGATTTTCATCCGCC  400 

TSLKMGOSSYYOPNYLQSOEEICORFLriVTKIF 
TAnTCATTAAAAAATGGAGATAGTAGTTATTATGACCCTAATTArTTACAAAGTGATGAAGAAAAGGATAGATTTTTAAAAATACTCACAAAAATATTT  500 

NRINNNLSCGILLEELSKANPYLGNDNTPDNOF 
AATAGAATAAAIAATAATCTTTCAGCAGGGATTTTATTAGAAGAACTGTCAAAAGCTAATCCATATTTAGGGAATGATAATAacaCATAATCAATTCC  600 

NlGDASAVElKFSNGSODILLPNVItMCAEPDLF 
ATATTGCTGATGCATCAGCAGTTGAGATTAAATTCTCAAATGGTAGCCAAGACaTACTATTACCTAATGTTATTATAATGGGAGCAGAGCCTGATTTATT  700 

r 

ETNSSkISLRNNYMPSNHGFGSIAIVTFSPEYS 

TGAAACTAACAGTTCCAATATTTCTCTAAGAAATAATTATATGCCAAGCAATCACGGTTTTGCATCAATAGCTATAGTAACATTaaCCTGAATATTCT  800 

c 

FRFUDRSHREFIOOPALTLIIHELIHSIHGLYGA 
TTTAGATTTAATGATAATAGTATG«ATGAATTTATTCAAGATLCTGCTCTTACAITAATGCATGAATTAAfACATTCATTACATGGACTATATGGGGCTA  900 

M 

KGITTCYTITQIcaRPLITHIRGTIlIEEFtTFGGT 
AAGGGATTACTACAAAGTATACTATAACACAAAAACAAAATCCCCTAArAACAAATATAAGAGGTACAAATATTGAAGAATTCTTAACTTTTGGAGGTAC  1000 
T  EcoRI 

OLRIITSAOSHDIYTMLLAOYKKIASICISICVQV 
TGATTTAAACATTATTACTAGTGCTCAGTCCAATGATATCTATACTAATCTTCTAGCTGATTATAAAAAAATAGCGTCTAAAOTAGCAAAGTACAAGTA  1100 

SMPLL«PY«OVFEAKYGk.DKDASGIY$V»IMKF 
TCTAATCCACTACTTAATCCTTATAAAGAIGTTTTTGAAGCAAAGTATGGATTAGATAAAGAIGCTAGCr.GAATTTATTCGGTAAATA’'AAACAAATTTA  1200 

9DIFKKLYSFTEFOIATKFOVKCROTYISOYICYF 
ATGATATTTTTAAAAAATTATACAGCTTTACGGAATTTGATTTAGCAACTAAATTTCAAGTTAAATGTAGGCAAACTTATATKKACAGTATAAATACTT  1300 

KLSHLLNDSITNISEGYNINNIKVMFRGONANL 
CAAACTTTCAAACTTGTTAAATGATTCTATTTATAATATAICAGAAGGCTATAATATAAATAATTTAAAGCTAAATTTTAGACGAaGAA’GCAAATTTA  1400 

ClUIN 

MPRIITPITGRGLVKKIIRFCKMIVSVKCtRKS 
AATCCTAGAATTATTACACCAATTACAGGTAGAGGACTAGTAAAAAAAATC/TTAGATTTTGTAAAAATATTGTTTCTGTAAAAGCCATAAGGAAATCAA  1500 

ICIEIMNGELFFVASEN3YH00NINTPKEI0DTV 
TATGTATCGAAAIAAATAATGGTGAGTTATTTTTTGTGGCTTCCGAGAArAGTIATAATGATGATAATATAAATACTCCTAAAGAAATTGACGATACAGT  1600 

rSNNNYENOLOOVILHFMSESAPGLSDEKLHLT 
AAaTCAAATAATAATTATGAAAATGATTTACy  TCAGGTTATTTTAAATTTTAATACTGAATCAGCACCTGGAC7TTCAGATGAAAAATTAAATTTAACT  1700 

IQNDAYIPKYO^NCTSOIEQHDVNELNVFFYLD 
ATCCAAAATGATGCTTATATACCAAAATATGATTtTAATGGAACAAGTGATATAGAACAACATGATGTTAATGAACTTAATGTAITTITCTATTTAGATG  1800 

AOrVPEGEI(NVNLTSSIOTAll.EOPItITTFFSSE 
CACAGAAAGTGCCCGAAGGTGAAAATAATGTCAATCTCACCTCTTCAATTGATACAGCATTATTAGAACAACCTAAAATATATACATTTTTTTCATCAGA  1900 

FIMRVUKPVOAALFVSUlOOVLVOFTTEAUPItS 
ATTTATTAATAATGTCAAIAAACCTGTGCAACaGCATTATTTGTAAGCTGGATACAACAAGTGTTAGTAGATTTTACTACTCAAGCTAACCAAAAAAGT  2000 

TVDKlAOISIVVPYIGLAtMIGREAORGRFKDA 
ACTGTTGATAAAATTGCAGATATTTCTATAGT.uTTCCATATATAGGTCTTGCTTTAAATATAGCAAATGAAGCACAAAAAGGAAATTTTAAAGATGCAC  2100 

LELLGAGILLEFEPELLIPTIIVFIIICSFLGSSD 
TTGAAITATTAGGAGCAGGTATTTTATTAGAATTTCAACCCGAGCTTTTAATTCCTACAATTTTAGTATTCACCATAAAATCTTniTAGCTTCATCTGA  2200 

MKNKVIKAINNALKEROEKUREVYSFIVSNWMT 
TAATAAAAATAAAGTTATTAAAGCAATAAATAATGCATTGAAAGAAAGAGATGAAAAATGGAAAGAAGTATATAGTTTTATAGTATCGAATTGGATGACT  2300 


Fig. 2. Complete nucleotide sequence of the type E gene. The BoNT/E amino acid sequence is given
in the single letter code above the central nucleotide of the corresponding codon. Differences between
the NCTC 11219 sequence and the partial nucleotide sequences of the genes of strain Beluga and
(CIUTOFHKRKEOHYQALOIIOVMAIICTIIESICYH
AAAATTAATACACAATTTAATAAAAGAAAAGAACAAATCTATCAAGCTTTACAAAATCAACTAAATGCAATTAAAACAATAATACAATCTAAGTATAATA  2400 

SVTLEEKNELTMKYOIKQlEMELNQtCVSIANMM] 
6TTATACT7TAGAGGAAAAAAATGAGCTTACAAATAAATATGATATTAAGCA.\ATACAAAATGAACTTAATCAAAAGCTTTCTATAGCAATGAATAATAT  2500 

ORFLTESSISYLHItLIHEVItllllCLREYOEIIVItT 
AGACAGGTTCTTAACTGAAAGTTCTATATCCTATTTAATGAAATTAATAAATGAAGTAAAAATTAATAAATTAAGAGAATATGATGAGAATGTCAAAACG  2600 

TLLNYIIOHGSILGESOOELMSKVTOTLMNSIP 

TATTTATTCAATTATATTATACAACATGGATCAATCTTGGGAGAGAGTCAGCAACAACTAAATTCTATCGTAACIGATACCCTAAATAATAGTATTCCTT  2700 

FKLSSYTDOeiLISYFMKFFXRIKSSSVLMMRYK 
TTAAGaTTCrTCTTATACAGATGATAAAAnTTAATTTCATATTTT\ATAAATTCTTTAAGAGAATTAAAAGTAGTTCAGTTTTAAATATCAGATATAA  2800 

RDKYVDTSGYDSRIRIRGDVYKYPTRKROFGIY 
AAATGATAAATACGTAGATACTTCAGGATATGATTCAAATATAAATATTAATGGAGATGTATATAAATATCCAACTAATAAAAATCAATTTGGAATATAT  2900 

ROXLSEVNISQRDYIIYORKYKRFSISFUVRIP 
AATGATAAACTTAGTGAAGTTAATATATCTCAAAAIGATTACATTATATATGATAATAAATATAAAAATTTTAGTATTAGTTTTTGGGTAAGAATTCCTA  3000 
DdeI

HYDNKIVMVNHEYTI1MCHRDNMSCWKV$LNHNE 
ACTATGATAATAACATAGTAAATGTTAATAATGAATACACTATAATAAATTCTATGAGAGATAATAATTCACCATCGAAAGTATCTCTTAATCATAATGA  3100 

IIUTLQDMACIMQKLAFNYGNANGISDYIMKWI 
AATAATTTGGACATTGCAAGATAATGCAGGAATTAATCAAAAATTAGCATTTAACTATGGTAACGCAAATGGTATTTCTGATTATATAAATAAGTGGATT  3200 

FVTITNORLGOSKLYtHGMLIDQXSILNLGNIH 
TTTGTAACTATAACTAATGATAGATTAGGAGATTCTAAACTTTATATTAATGGAAATTTAATAGATCAAAAATCAATTTTAAATTTAGGTAATATTCATO  3300 

VSOMTLFXIVRCSYTRYICIRYFMIFOKELOETE 
TTAGTGACAATATATTATTTAAAATAGTTAAITGTAGITATACAAGATATATTCGTATTAGATATTTTAATAnTTTGATAAAGAATTAGATGAAACAGA  3400 

iqtlysnepmtnilkdfvgnyllydkeyyllnv 

AATTCAAACTTTATATAGCAATGAACCTAAIACAAATATTTTGAAGGATTTTTGGGGAAATTATTTGCTTTATCACAAAGAATACTATTTATTAAATGTG  3500 

LKPRRFIORRKOSTUSIRRIRSTILLARRLYSO 
TTAAAACCAAATAACTTTATTGATAGGAGAAAAGATTCTACTTTAAGCATTAATAATATAAGA,AGCACTATTCTTTTAGCTAATAGATTATATAGTGGAA  3600 

IKVKIQRVRRSSTRORLVRKROOVYIRFVASXTH 
TAAAAG/TAAAATACAAAGAG7TAATAATAGTAGTACTAACGATAATCTTGTTAGAAAGAATGATCAGGTATATATTAATTTTGTAGCCAGCAAAACTCA  3700 

IfPLYAOTATTRKEKTIXlSSSGRRFROVVVHR 
C7TATTTCCATTATATGCTGATACAGCTACCACAAATAAAGAGAAAACAATAAAAATATCATCATCTGCCAATAGATTTAATCAAGTA0TAGTTATGAAT  3800 

SVGNNCTNMFKNNNGNNIGLLGFKAOTVVASTU 

TCAGTAGGAAATAATTGTACAATGAATTTTAAAAATAATAATGCIAATAATATTGGGTTGTTAGGTTTCAACGCAGATACTCTAGTTGCTAGTACTTGGT  3900 

YYTHHRDHTRSRGCFUNFtSEEHGWOEC 

ATTATACACATATGAGAGATCATACAAACAGCAATGGATGTTTTTGGAACTTTATTTCTGAAGAACATGGATGGCAAGAAAAATAAAAATTACATTAAAC  4000 

GGCTAAAGTCATAAATTCCAAAGGACTTAG  4030 
DdeI


strains Mashike, Iwanai and Otaru, are indicated below the appropriate position of the sequence, in
lower and upper case letters, respectively. An upward facing arrow indicates an insertion. Any change
in the encoded amino acid is indicated above the NCTC BoNT/E amino acid sequence. The putative
-10 promoter region (based on homology to the BoNT/A gene 5' non-coding region) and
transcriptional initiation site are marked by a dashed line above the sequence and a downward facing
arrow, respectively. The ribosome binding site is indicated by a line above and below the sequence.


(Young et al., 1989). The AUG codon is preceded by a sequence typical of clostridial
ribosome binding sites, in both its composition and distance (8 bases) from the AUG initiation
codon. The codon usage exhibited by the gene is also typical of clostridial genes, with an
extreme bias for codons ending in A and T, and the frequent use of codons recognised as
modulators of translation in E. coli. Although a number of sequences 5' to the BoNT/E
structural gene exhibit some similarity to procaryotic promoter elements, assignment of such
sequences as transcriptional signals will require appropriate experimental data. A reasonably
high degree of sequence similarity (77.4% identity) does, however, exist between the 5'
non-coding regions of the BoNT/A (Binz et al., 1990) and BoNT/E genes. Based on this homology,
the transcriptional start point would be nucleotide 117, and the TATATT motif at position 103
to 108 the putative -10 promoter element (Figure 2).
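
The coordinates quoted for the open reading frame are mutually consistent, as the following
sketch shows. It uses only the start (228) and end (3986) positions given in the text; the
helper names are hypothetical, and the short sequence passed to the A+T function is just the
first seven codons of the gene as read from Fig. 2, used to demonstrate the calculation rather
than to reproduce the 75% figure for the whole gene.

```python
START = 228   # first base of the AUG initiation codon (Fig. 2 numbering)
END = 3986    # last base of the UAA stop codon


def codon_span(codon_number, start=START):
    """Return (first, last) nucleotide positions of a codon, 1-based."""
    first = start + 3 * (codon_number - 1)
    return first, first + 2


def at_content(seq):
    """Fraction of A and T bases in a DNA sequence."""
    seq = seq.upper()
    return sum(seq.count(base) for base in "AT") / len(seq)


n_codons = (END - START + 1) // 3   # includes the stop codon
n_residues = n_codons - 1           # the stop codon encodes no amino acid

print(n_codons, n_residues)
print(codon_span(1), codon_span(n_codons))
print(round(at_content("ATGCCAAAAATTAATAGTTTT"), 2))
```

The 1253 codons therefore correspond to a polypeptide of 1252 residues once the stop codon
is excluded.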

The sequence of a 983 bp portion of the BoNT/E gene (equivalent to nucleotides 1 to 988
of Fig. 2), encoding part of the L chain, from a number of other C. botulinum type E strains
has been reported, namely strain Beluga (Binz et al., 1990) and strains Mashike, Iwanai and
Otaru (Fujii et al., 1990). The sequences derived from the latter 3 strains were identical and
differ from that reported here for strain NCTC 11219 by a single nucleotide at position 916.
Thus codon 230 of the BoNT/E genes from strains Mashike, Iwanai and Otaru is AUG, while
in the BoNT/E gene of strain NCTC 11219, this codon is AAG. In contrast, the sequence
derived from strain Beluga exhibits 4 nucleotide differences to the sequence of NCTC 11219.
Three of these changes occur in the 5' non-coding region, including a single base 'C' insertion
in the Beluga sequence (see Fig. 2), while the fourth difference results in a codon alteration of
CGT (Beluga) to GGT (NCTC 11219) at position 756 (Fig. 2).

Comparative alignment of the nucleotide sequences of the two regions of the BoNT/A gene
used as DNA probes in our original experiments with the equivalent regions of the BoNT/E gene
confirmed that a greater degree of DNA homology occurred with the H chain probe than
with the L chain probe. Thus, the 389 bp BoNT/A HpaI-XhoII fragment exhibited 61.7%
homology to the BoNT/E gene, whereas the 628 bp HaeIII-HindIII fragment demonstrated
67.3% homology with the BoNT/E gene. The attainment of the complete nucleotide sequence
of the BoNT/E gene also provided an opportunity to assess the reasons for the apparent
ability/inability of the synthesised oligonucleotides to act as primers in PCR (Table 1). Such
an assessment did not prove particularly informative. Thus, although the presence of 7
sequence mismatches in the case of HE6 may have precluded annealing to BoNT/E genomic
DNA, 9 sequence mismatches in oligonucleotide HE4 apparently did not affect its ability to
prime in PCR, assuming the generated fragment was indeed the region targeted. The success
of primer HE4 may have been due to the fact that the 4 mismatches at the 3' end of the
oligonucleotide would all have resulted in neutral d(G·T) pairing. More difficult to explain
was the inability of LE1 to act as a primer (only 2 mismatches).
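
The mismatch counts discussed above can be reproduced by a position-by-position comparison
of each primer with its target region. The sketch below is illustrative: the primer and target
sequences are as best read from Table 1 and Fig. 2, so individual bases should be treated as
provisional, and the antisense primer HE6 is converted to its sense-strand equivalent before
comparison.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Base-by-base complement, without reversing the sequence."""
    return seq.translate(COMPLEMENT)


def mismatches(primer, target):
    """1-based positions at which primer and target differ."""
    if len(primer) != len(target):
        raise ValueError("primer and target must be the same length")
    return [i for i, (p, t) in enumerate(zip(primer, target), start=1) if p != t]


# (sense-strand primer sequence, BoNT/E target region), both written 5'->3'
PAIRS = {
    "LE1": ("ATAAATAGTTTTAATTATAAAGATCC",
            "ATTAATAGTTTTAATTATAATGATCC"),
    "HE4": ("AATAAAGCAATGATAAATATAAATAAATT",
            "TCTATAGCAATGAATAATATAGACAGGTT"),
    # HE6 is listed 3'->5' (antisense); its complement reads 5'->3' on the sense strand
    "HE6": (complement("CCTTGTTTTAAATATTATTTTTTTAT"),
            "GGAATAAAAGTTAAAATACAAAGAGT"),
}

counts = {name: len(mismatches(p, t)) for name, (p, t) in PAIRS.items()}
print(counts)
```

A count alone does not predict priming: the position of each mismatch relative to the 3' end,
and the kind of mispair formed, also matter, as the HE4 and LE1 results suggest.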


The complete amino acid sequence of the BoNT/E gene


The deduced amino acid sequence of BoNT/E demonstrated that the neurotoxin is
comprised of 1252 residues, making it the smallest neurotoxin yet characterised. The amino
acids at positions 423 through 435 demonstrated perfect agreement with those determined
experimentally by NH2-terminal sequencing of the purified BoNT/E H chain (Sathyamoorthy
and Dasgupta, 1985; Schmidt et al., 1985). A more extensive recent sequence had indicated the
presence of a single unassigned amino acid ("X") at BoNT/E H chain positions 16 and 19
(Dasgupta and Datta, 1988). The sequence deduced here indicates that the first "X" equates to
the dipeptide sequence Ala-Ser, while the second "X" is a Ser residue. Comparisons between the
NCTC 11219 L chain and the partial amino acid sequences of the BoNT/E L chains of strain
Beluga and strains Mashike, Iwanai and Otaru indicated a single amino acid difference in each
case. Thus, the Gly residue at position 177 in the NCTC 11219 toxin has been replaced by
Arg in Beluga BoNT/E, while the Lys amino acid at position 230 in the NCTC 11219 BoNT/E
is Met in the equivalent position of the three Japanese strain-derived toxins.
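
Each of these amino acid differences is accounted for by a single base change, and the
residue numbers map directly onto the nucleotide positions of Fig. 2. The sketch below shows
the mapping, assuming the gene starts at nucleotide 228; codons are written in DNA notation
(T in place of U), the codon table is deliberately limited to the four codons involved, and
the helper names are hypothetical.

```python
GENE_START = 228  # first base of the initiation codon (Fig. 2 numbering)

# Only the four codons needed here; not a full genetic code table
CODON_TO_AA = {"GGT": "Gly", "CGT": "Arg", "AAG": "Lys", "ATG": "Met"}


def nucleotide_position(residue, codon_offset):
    """Fig. 2 position of base codon_offset (0, 1 or 2) within a residue's codon."""
    return GENE_START + 3 * (residue - 1) + codon_offset


def mutate(codon, codon_offset, new_base):
    """Return the codon with one base replaced."""
    return codon[:codon_offset] + new_base + codon[codon_offset + 1:]


# (residue, NCTC 11219 codon, offset of the changed base, base in the other strain)
changes = [
    (177, "GGT", 0, "C"),   # Beluga
    (230, "AAG", 1, "T"),   # Mashike, Iwanai and Otaru
]

results = []
for residue, codon, offset, base in changes:
    variant = mutate(codon, offset, base)
    results.append((nucleotide_position(residue, offset),
                    CODON_TO_AA[codon], CODON_TO_AA[variant]))
print(results)
```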


1.3  CLONING/  SEQUENCING  OF  THE  BoNT/B  GENE 


1.3.1  Summary 

DNA fragments derived from the Clostridium botulinum type A neurotoxin (BoNT/A) gene
(botA) were used in DNA/DNA hybridisation reactions to derive a restriction map of the
region of the C. botulinum type B strain Danish chromosome encoding botB. As one probe
encoded part of the BoNT/A heavy (H) chain, and the other part of the light (L) chain, the
position and orientation of botB relative to this map was established. The temperature at which
hybridisation occurred indicated that a higher degree of DNA homology occurred between the
two genes in the H chain encoding region. Using the derived restriction map data, a 2.1 kb
BglII-XbaI fragment encoding the entire BoNT/B L chain and 108 amino acids of the H chain
was cloned and characterised by nucleotide sequencing. A contiguous 1.8 kb XbaI fragment
encoding a further 623 amino acids of the H chain was also cloned. The 3'-end of the gene
was obtained by cloning a 1.6 kb fragment amplified from genomic DNA by inverse
polymerase chain reaction. Translation of the nucleotide sequence derived from all three
clones demonstrated that BoNT/B was composed of 1291 amino acids. Comparative alignment
of its sequence with all currently characterised BoNTs (A, C, D, E) and tetanus toxin (TeTx)
showed that a wide variation in percentage homology occurred dependent on which component
of the dichain was compared. Thus, the L chain of BoNT/B exhibits the greatest degree of
homology (50% identity) with the TeTx L chain, whereas its H chain is most homologous
(48% identity) with the BoNT/A H chain. Overall, the 6 neurotoxins were shown to be
composed of highly conserved amino acid domains interspersed with amino acid tracts exhibiting
little overall similarity. In total 68 amino acids, out of an average of 442, are absolutely
conserved between L chains and 110, out of 845 amino acids, between H chains.
Conservation of Trp residues (1 in the L chain, and 9 in the H chain) was particularly striking.
The most divergent region corresponds to the extreme carboxy-terminus of each toxin, which
may reflect differences in specificity of binding to neurone acceptor sites.
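
Percentage identity figures of this kind are obtained by counting identical positions in an
alignment. A minimal sketch is shown below; it assumes the two sequences have already been
aligned (gaps written as "-") and ignores gapped columns, which is one common convention but
not necessarily the one used for the values quoted above. The two short peptides are invented
for the example. The last line simply expresses the absolutely conserved residues as
percentages of the average chain lengths given in the text.

```python
def percent_identity(a, b):
    """Percent identical positions over aligned columns where neither has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return 100 * sum(x == y for x, y in pairs) / len(pairs)


# Toy aligned peptides (hypothetical), differing at one of ten positions
identity = percent_identity("HELIHSLHGL", "HELIHVLHGL")

# Absolutely conserved residues as a share of average chain length (from the text)
l_chain_conserved = round(100 * 68 / 442, 1)
h_chain_conserved = round(100 * 110 / 845, 1)
print(identity, l_chain_conserved, h_chain_conserved)
```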


1.3.2  Results  and  Discussion 


Southern  blot  analysis  of  the  botB  gene 

A 389 bp HpaI-XhoII botA fragment, encoding amino acids 216 through 346 of the
BoNT/A L chain, and a 628 bp HaeIII-HindIII fragment, coding for amino acids 526 through
736 of the H chain (Thompson et al., 1990), were radiolabelled and used in DNA/DNA
hybridisations with type B chromosomal DNA cleaved with various restriction enzymes.
Reactions were performed in aqueous solution over a range of temperatures. "Weak"
hybridisation between the two genes was found to occur at 53°C and 56°C with the L and H
chain probes, respectively (data not shown). The strength of the signal observed, and the
relatively low stringency required, were indicative of a fairly low level of DNA homology
between botA and botB. Furthermore, these results suggest that the L chain encoding regions
of the two genes are less homologous than the H chain encoding regions, at least in the areas
probed. Having established the conditions at which hybridisation occurred, the type B
genomic DNA was cleaved with various combinations of restriction endonucleases and the
nylon membranes carrying the resultant fragments sequentially hybridised with the two probes.
The data obtained allowed the derivation of a restriction map of the region of the type B
genome encoding botB. Furthermore, the use of the two probes enabled the assignment of both
the position of botB and its relative orientation, with respect to the derived map (Fig. 3).
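
Restriction mapping of this kind rests on comparing the fragment sizes produced by single and
combined digests. The logic can be sketched in silico as follows. The recognition sequences
and cut offsets are the standard ones for these enzymes; the DNA sequence is a toy example
constructed for illustration and has no connection with the botB region.

```python
import re

# Recognition sequence and cut offset (bases after the start of the site)
SITES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "XbaI": ("TCTAGA", 1),     # T^CTAGA
    "BglII": ("AGATCT", 1),    # A^GATCT
}


def fragment_lengths(seq, enzymes):
    """Fragment sizes, in map order, from digesting linear DNA with the given enzymes."""
    cuts = set()
    for name in enzymes:
        site, offset = SITES[name]
        # The lookahead finds every occurrence, including overlapping ones
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Toy sequence (hypothetical) carrying one EcoRI site and one XbaI site
toy = "AAAA" + "GAATTC" + "CCCCCC" + "TCTAGA" + "GGGG"
single = fragment_lengths(toy, ["XbaI"])
double = fragment_lengths(toy, ["EcoRI", "XbaI"])
print(single, double)
```

Comparing the single digest with the double digest shows which fragment contains the second
site, which is the reasoning used to order sites on a map derived from Southern blots.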


Cloning  and  sequencing  of  the  botB  L  chain. 

The restriction map derived by the Southern blot experiments (Fig. 3) indicated that a 2.1
kb BglII-XbaI fragment principally encoded the L chain of BoNT/B. To clone this DNA, and
to minimise the risk of cloning contiguous BoNT/B encoding regions, the targeted fragment
was purified by a two-stage gel isolation procedure. C. botulinum type B chromosomal DNA
was cleaved with XbaI and fragments of approximately 7.5 kb in size purified from agarose
gels by electroelution. The isolated DNA was then subjected to digestion with BglII, DNA
fragments of around 2.1 kb in size gel-purified, ligated to similarly cut pMTL32 vector DNA


[Fig. 3 graphic: restriction map of the botB region (scale bar 0 to 2 kb), showing the BoNT/A-derived
DNA probes (light and heavy), the genomic clones, the inverse PCR clone (pCBB3) and the structural
gene (light and heavy chain regions). KEY: B, BglII; E, EcoRI; H, HindIII; Ha, HaeIII; Hp, HpaI;
X, XbaI; Xh, XhoII.]

Fig. 3. Strategy employed in the cloning of the botB gene. The illustrated restriction map of the C.
botulinum genome was generated using the indicated botA DNA fragments as probes in Southern blots.
Regions of the strain B/Danish chromosome that were cloned in the recombinant plasmids pCBB1 and
pCBB2 are represented by open boxes below the restriction map. The cloned inserts of these plasmids
were shown to be contiguous on the genome by PCR amplification of the region of the chromosome
spanning their common XbaI site, using primers X1 (5'-CCAAGTGAAAATACAGAATCAC-3') and
X2 (5'-CCCACTTTGTCTATCATTTA-3'), and sequencing across this junction. The insert of pCBB3
was derived by PCR amplification of HindIII-cut, concatenated chromosomal DNA using primers X4
(5'-AT-AGAGATTTATATATTGGAG-3') and X3 (5'-TTATATACAGCCAAATGCTCCTTGC-3').


(Fig. 4), and the resultant TG1 transformants screened for the presence of recombinant clones
using the botA L chain probe. The vector pMTL32 was specifically constructed for the
purposes of cloning the botB DNA (see Fig. 4). Based on the pMTL1003 backbone (Brehm et
al., 1991), it carries multiple cloning sites flanked on either side by tandem copies of
transcriptional terminators. Heterologous genes inserted into the multiple cloning sites will
therefore only be expressed if they carry indigenous transcriptional elements recognised by the
RNA polymerases of E. coli.

Fig. 4. The cloning vector pMTL32. This plasmid was derived as follows. A synthetic DNA fragment
(5'-AGCCCGCCTAATGAGCGGGCTTTTTTTT-3'), corresponding to the E. coli trpA transcriptional
terminator, was ligated to StuI-cleaved pMTL23 (Chambers et al., 1988) and a recombinant plasmid
selected (pTRP23) in which two tandem copies of trpA had been inserted. The resultant double
terminator, together with part of the pMTL23 polylinker region, was excised as a 107 bp NruI-EcoRI
fragment and inserted between the EcoRI and EcoRV sites of plasmid pMTL1003 (Brehm et al., 1991).
As the c. 350 bp EcoRI-EcoRV fragment of pMTL1003 is deleted during this manipulation, the
resultant plasmid, pMTL32, does not carry a copy of the trp promoter.

The recombinant plasmid obtained, designated pCBB1, was shown by digestion with
appropriate endonucleases to contain restriction enzyme recognition sites consistent with the
map illustrated in Fig. 3. Its entire insert was excised by digestion with BamHI and BglII, and
M13 recombinant templates containing random inserts were derived using a sonication procedure
(Minton et al., 1986). Using these templates, and custom synthesised oligonucleotides, the
entire nucleotide sequence of the insert was determined on both strands. Translation of the
resultant sequence indicated the presence of an open reading frame (ORF) encoding a
polypeptide of 549 amino acids. The aminoterminus of this polypeptide exhibited
perfect conformity to that experimentally determined for purified BoNT/B L chain
(Sathyamoorthy and DasGupta, 1985). Amino acids 442 through 459 were identical to the sequence
determined for purified BoNT/B H chain (Sathyamoorthy and DasGupta, 1985). Thus the insert
carried by pCBB1 was deemed to encode the entire L chain of BoNT/B and 108 amino acids
from the H chain.
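
The translation step described above can be illustrated with a short Python sketch. It is not the software used in this study: it simply assumes the standard genetic code and takes as input the first 45 nucleotides of the ORF as printed in Fig. 5, beginning at the ATG start codon. The names translate, CODON_TABLE and BOTB_START are illustrative only.

```python
# Minimal sketch: translate a DNA open reading frame with the standard genetic code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the 64-codon lookup table from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; stop at the first stop codon or incomplete codon."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 15 codons of the botB ORF as printed in Fig. 5 (starting at the ATG).
BOTB_START = "atgccagttacaataaataattttaattataatgatcctattgat"
print(translate(BOTB_START))  # MPVTINNFNYNDPID
```

A deduced aminoterminus of this kind is what would be compared, residue by residue, with a sequence obtained by Edman degradation of the purified L chain.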


Cloning  and  sequencing  of  the  botB  H  chain. 

Having established that the 2.1 kb BglII-XbaI fragment encoded the entire BoNT/B L chain
and the aminoterminus of the H chain, it was apparent that the adjacent 1.8 kb XbaI fragment
(Fig. 3) should encode the majority of the remaining H chain. Type B chromosomal DNA was
cleaved with HindIII, fragments of approximately 3.5 kb isolated, digested with XbaI and
fragments of around 1.8 kb in size gel purified. The isolated DNA was ligated with XbaI-cleaved
pMTL32, transformed into E. coli TG1 and recombinant plasmids identified by
probing with the radiolabelled botA H chain probe. One such plasmid was designated pCBB2,
and the nucleotide sequence of its insert determined, following its insertion in M13mp18, by
employing custom synthesised oligonucleotide primers.

Translation of the nucleotide sequence obtained revealed the presence of a continuous
ORF of 623 codons, in the same reading frame relative to the XbaI site as that of the insert of
plasmid pCBB1. To confirm that the two sequences were indeed contiguous, a 289 bp region
of DNA encompassing the XbaI site was amplified from type B genomic DNA using the
primers X1 (5'-CCAAGTGAAAATACAGAATCAC-3') and X2 (3'-CCCACTTTGTCTATCATTTA-5')
in a polymerase chain reaction (PCR), and cloned directly
into ddT-tailed, SmaI-cut M13mp8. Nucleotide sequencing of a derivative template, using
universal primer, demonstrated that the inserts of plasmids pCBB1 and pCBB2 were
contiguous in the C. botulinum type B chromosome.
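
How a primer pair defines the size of a PCR product can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used here: the template is a hypothetical placeholder in which the X1 site and the X2 site are separated by 247 unknown bases (written as N), a spacing chosen only so that the toy product matches the 289 bp quoted above. The helper names are likewise illustrative.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (N is kept as N)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def amplicon_length(template: str, forward: str, reverse: str) -> int:
    """Length of the product primed by forward and reverse (both written 5'->3')."""
    template = template.upper()
    start = template.find(forward.upper())
    site = reverse_complement(reverse)  # where the reverse primer anneals, top strand
    end = template.find(site, start + len(forward)) if start >= 0 else -1
    if start < 0 or end < 0:
        raise ValueError("primer site not found in template")
    return end + len(site) - start


X1 = "CCAAGTGAAAATACAGAATCAC"
# X2 is quoted 3'->5' in the text, so it is reversed here to give the 5'->3' primer.
X2 = "CCCACTTTGTCTATCATTTA"[::-1]

# Hypothetical template: X1 site, 247 placeholder bases, then the X2 annealing site.
toy_template = "NNNNN" + X1 + "N" * 247 + reverse_complement(X2) + "NNNNN"
print(amplicon_length(toy_template, X1, X2))  # 289 for this toy template
```

With a real template the same function would be applied to the determined sequence spanning the XbaI site, rather than to placeholder bases.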


Completion  of  the  botB  sequence. 

By combining the two sequences of pCBB1 and pCBB2, the derived contiguous ORF
encoded 1170 amino acids, indicating that some 120 codons of the botB gene were
missing. A DNA region encompassing the remaining 3'-end of the gene was cloned by inverse
PCR. Type B chromosomal DNA was cleaved with HindIII, incubated with T4 ligase, and the
resultant concatenated DNA used as a template in PCR with the oligonucleotides X3
(5'-ATAGAGATTTATATATTGGAG-3') and X4 (5'-TTATATACAGCCAAATGCTCCTTGC-3').
The 1.6 kb fragment generated was cloned directly into the specialised vector pCR1000
and the recombinant plasmid obtained designated pCBB3. A plasmid sequencing reaction,
undertaken with a primer previously employed in the determination of the nucleotide sequence
of the insert of plasmid pCBB2, confirmed the presence of the botB gene. Thereafter the
nucleotide sequence of the region of pCBB3 encompassing the 3'-end of botB was determined
by subcloning selected overlapping fragments into M13. To rule out the possibility that the
insert of pCBB3 contained PCR-induced errors, a second version of this recombinant plasmid
was derived by cloning the amplified DNA product from a further independent
inverse PCR. Nucleotide sequencing of the appropriate regions of this second plasmid gave a
sequence identical to that already derived from the primary isolate of pCBB3.
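
The logic of inverse PCR used above can be sketched in Python. The example assumes, for simplicity, that the restriction fragment self-ligates into a monomeric circle, and it uses short invented sequences rather than botB DNA; the function and variable names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def inverse_pcr_product(fragment: str, outward_fwd: str, outward_rev: str) -> str:
    """Predict the top-strand product of inverse PCR on a self-ligated fragment.

    outward_fwd matches the top strand at the 3' side of the known region;
    outward_rev is complementary to the top strand at its 5' side.
    Both primers are written 5'->3'.
    """
    fragment = fragment.upper()
    fwd_start = fragment.find(outward_fwd.upper())
    rev_site = reverse_complement(outward_rev)
    rev_start = fragment.find(rev_site)
    if fwd_start < 0 or rev_start < 0:
        raise ValueError("primer site not found")
    rev_end = rev_start + len(rev_site)
    if rev_end > fwd_start:
        raise ValueError("primers do not face outwards")
    # Read from the forward primer to the fragment end, cross the ligation
    # junction, then continue from the fragment start to the reverse primer site.
    return fragment[fwd_start:] + fragment[:rev_end]


# Invented toy fragment: unknown flanks either side of a known region.
left_flank, rev_site = "GATTACAGGC", "TTGACCAT"
known_core, fwd_primer = "CCCCCCCC", "AGCTTGCA"
right_flank = "TTTAGGGATC"
fragment = left_flank + rev_site + known_core + fwd_primer + right_flank

product = inverse_pcr_product(fragment, fwd_primer, reverse_complement(rev_site))
print(product, len(product))
```

The product contains both previously unknown flanks joined at the ligation junction, which is why the method recovers DNA lying outside the region already sequenced.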


RBS  HPVTIMUFNYMOPIOM 

1  agcaatttatgccattaaaagggatataaacttaaaataaggaggasaatatttatgccagttacaataaataattttaattataatgatcctattgata 


NNI  INHEPPFARGTGRTYKAFKITORIUI  IPER 
101  ataataatattattatgatggaccctccatttgcgagaggtacggggacaiattataaagcttttaaaatcacagatcgtatttggataataccggaaag 

HindIII

YTFGYKPEOFHKSSGIFMROVCEYYOPDYLMTH 
201  atatacttttggatataaacctgaggattttaataaaagttccggtatttttaatagagatgtttgtgaatattatgatccagattacttaaatactaat 

OKKMI  FLOTMIKI.FNRIKSKPLGEKLLEMI  I.KGI 
301  gataaaaagaatatatttttacaaacaatgatcaagttatttaatagaatcaaatcaaaaccattgggtcaaaagttattagagatgattataaatggta 

PYLGORRVOLEEFHTHIASVTVHICLISHPGEVE 
401  TACCTTATCTTGGACATAGACGTGTTCCACTCGAAGAGTTTAACACAAACATTGCTAGTGTAACTGTTAATAAATTAATCAGTAATCCAGGAGAAGTGGA 

.IKKGtFANLIIFGPGPVLNEMETIDIGlQNHFA 
501  GCCAAAAAAAGGTATTTTCGCAAATTTAATAATATTTGCACCTGGGCCAGTTTTAAATGAAAATGAGACTATAGATATAGGTATACAAAATCATTTTGCA 

sregfggimqmkfcpeyvsvfmhvoehkgasifh 

601  TCAAGGGAAGGCTTCGGGGGTATAATGCAAATGAAGTTTTGCCCAGAATATGTAAGCCTATTTAATAATGTTCAAGAAAACAAAGGCGCAAGTATATTTA 

RRGYFSOPALlLHHELIHVtMCLTGIRVDDLPI 
701  atagacgtgcatatttttcagatccagccttgatattaatgcatgaacttatacatgttttacatggattatatggcattaaagtagatgatttaccaat 

VPHEKrFF  HOSTDAIQAEEl.  YTFGGQOPSIITP 
801  TGTACCAAATGAAAAAAAATTTTTTATCCAArCTACAGATGCTATACACGCAGAAGAACTATATACATTTCGAGGACAAGATCCCAGCATCATAACTCCT 

STDKSIYOKVLQHFRGIVORLHKVLVCISOPMIH 

901  ICTACGGA  taaaagtatctatgataaagttttgcaaaattttagagggatagttcatagacttaacaaggttttagtttgcatatcagatcctaacatta 

IHIYKMICFICOKTICFVEOSEGICYSIOVESFOICLY 
1001  atattaatatatataaaaataaatttaaagataaatataaattcgttgaagaitctgagggaaaatatagtatagatgtagaaagttttgataaattata 

KSIHFGFTETHIAEMYKIKTRASYFSOSLPPVIC 
1101  TAAAAGCTTAATGTTTGGTTTTAC  ■AAACTAAI'ATAGCAGAAAATTATAAAATAAAAACTAGAGCTTCTTATTTTAGTGAfTCCTTACCACCAGTAAAA 
HindIII

IKHLlOHEIYTIEEGFHISOICOMEICEYRGQHICAI 
1201  ATAAAAAATTTATTAGATAATGAAATCTATACTArAGACGAAGGGTTTAATATATCTGATAAAGATATGGAAAAAGAATATAGACGTCAGAAIAAAGCTA 

r+H-CHAIN 

HICQAYEEISICEKLAVrKIQHClCSVICAPGICItlV 
1301  taaataaacaagcttatgaagaaattagcaaggagcatttggctgtatataagatacaaatctgtaaaagtgttaaagctccacgaatatctattgatgt 
HindIII

DHEDLFFIAOKMSFSOOLSICHERIEYHTOSHYI 

1401  tgataatgaagatttgttctttatagctgataaaaatagtttttcagatgatttatctaaaaacgaaagaatagaatataatacacagagtaattatata 

EHDFPIHELILOTOLISICIEIPSEHTESLTOFHV 

1501  gaaaatgacttccctataaatgaattaattttagatactgatttaataagtaaaatagaattaccaagtgaaaatacagaatcacttactgattttaatg 

OVPVYEKOPAIKKIFTDEHTIFQYLYSOTFPLD 

1601  tagatgttccagtatatgaaaaacaaccccctataaaaaaaatttttacagatgaaaataccatctttcaatatttatactctcagacatttcctctaga 

XbaI

IROISLTSSFDDALLFSHKVYSFFSHDYIKTAH 
1701  TATAAGAGATATAAGTTTAACATCTTCATTTGATGATGCATTATTATTTTCTAACAAAGTTTATTCATTTTTTTCTATGGATTATATTAAAACTGCTAAT 

ICVVEAGLFAGUVKQIVHDFVIEAHICSHTHOKIAD 
1801  AAAGTGGTAGAAGCAGGATTATTTGCAGGTTGGGTGAAACAGATAGTAAATGATTTTGTAATCGAAGCTAATAAAAGCAATACTATGGATAAAATTGCAG 

ISLIVPYIGLALHVGHETAKGHFEHAFEIAGAS 
1901  ATATATCTCTAATTGTTCCTTATATAGCATTAGCTTTAAATGTAGGAAATGAAACAGCTAAAGGAAATTTTCAAAATGCTTTTGAGATTGCAGGAGCCAG 


Fig 5. Complete nucleotide sequence of the type B gene. The illustrated sequence was derived by
amalgamation of the derived nucleotide sequences of the inserts of pCBB1 to pCBB3 (Fig. 3). The


lLLEFIPEtLIPVVCAFt.LESTIOMXMKIIKTI 
2001  TATTCTACTAGAATTTATACCAGAACTTTTAATACCTGTAGTTGGAGCCTTTTTATTAGAATCATATATTGACAArAAAAATAAAATTATTAAAACAATA 

OMALTKRMEKUSOtlYGLIVAQULSTVMTaFYTIK 
2101  cataatgctttaactaaaagaaatgaaaaatggagtgatatctacggaitaatagtagcgcaatgcctctcaacagttaatactcaattttatacaataa 

EGHYKALKYOAQALEEIIKTRYMIYSEKEICSMI 
2201  AAGAGGGAATGTATAAGGCTTTAAATTATCAAGCACAAGCATTGGAAGAAATAATAAAATACAGATATAATATATATTCTGAAAAAGAAAAGTCAAATAT 

HIDFMDlHSKLHEGINaAIOMIMNFINGCSVSY 

2301  taacatcgattttaatgatataaattctaaacttaatgagggtattaaccaagctatagataatataaataattttataaatggatgttctgtatcatat 

IHKKMIPLAVEICLLOFORTLKICMI.  LMYJOEMKLY 
2401  ttaatgaaaaaaatgattccattagctgtacaaaaattactagactttgataatactctcaaaaaaaatttgttaaattatatagatgaaaataaattat 

LIGSAEYEKSrVMKYLICTIMPFDLSIYTMOTIL 
2501  AITrGATTGGAATGCAGAATATGAAAAATCAAAAGTAAATAAATACTTGAAAACCATTATCCCGTTTGATCTTTCAATATATACCAATGATACAATACT 

lEMFMKYNSEILMNllLNLRYKDHKLIDLSGYG 
2601  AATAGAAATCTTTAATAAATATAATAGCGAAATTTTAAATAATATTATCTTAAATTT.1ACATATAAGGATAATAATTTAATAGATTTATCAGGATATGGG 

AKVEVYOGVELMOICNaFKLTSSANSKIRVTaNQN 
2701  GCAAAGGTAGAGGTATATGATGGAGTCGAGCTTAATGATAAAAATCAATTTAAATTAACTAGTTCAGCAAATAGTAAGATTAGAGTGACTCAAAATCAGA 

lIFNSVFLOFSVG  FUIRIPICYANDGIONYINME 
2801  ATATCATATTTAATAGTGTGTTCCTTGATTTTAGCGTTAGGTTTTGGAIAAGAATACCTAAATATAAGAATGATGGTATAriAAATTATATTCATAATGA 

YTIINCMKMNSGUKISIRGNRIIUTLIDINCKT 
2901  ATATACAATAATTAATTGIATGAAAAATAATTCGGGCTGGAAAATATCTATTAGGGGTAATAGGATAATATGGACTTTAATTGATATAAATGGAAAAACC 

KSVFFEYN1RE0ISEY1NRVFFVTITMNLNMAK1 
3001  AAATCGGTATTTTTTGAATATAACATAAGAGAAGATATATCAGAGTATATAAATAGATGGTTTTTTGTAACTATTACTAATAATTTGAATAACGCTAAAA 

YtNGKLESNTOIKOIREVIANGEllFKLDGOlD 
3101  TTTATATTAATGGTAAGCTAGAATCAAATACAGATATTAAAGATATAAGAGAAGTTATTGCTAATGGTGAAATAATATTTAAATTACATGGTGATATAGA 

RTQFtVMKYFStFNTElSOSNIEERYKIQSYSE 
3201  tagaacacaatttatttggatgaaatatttcaotatttttaatacggaattaagtcaatcaaatattga.\gaaagatataaaattcaatcatatagc;aa 

YLKDFUGMPLMYRICEYYIlFRAGlirRSYIlCLrKOS 
3301  tatttaaaagatttttggggaaatccttt/utgtacaataaagaatattataigtttaatocggggaataaaaattcatatattaaactaaagaaagatt 

PVGEILTRSKYNONSKYtMYRDLYIGEKFIlRR 
3401  CACCTGTAGGTGAaAITTTAACACGTACCAAATATAATCAAAATTCTAAATATATAAATTATAGAGATTTATATATTGCAGAAAAATTTATTATAAGAAG 

KSNSOSIMOOIVRKEDYIYLOFFRLMOEWRVYT 

3501  aaagtcaaattctcaatctataaatgatcatatagttagaaaagaagattatatataktagatttttttaatttaaatcaagagtggagagtatatacc 

XbaI

YKYFKKEEEKL^LAPISOSDEFYRTtOIKEYOEQ 
3601  tataaatattttaagaaagaggaagaaaaattgtttttagctcctaiaagtgattctgatgagttttacaatactatacaaataaaagaatatgatgaac 

PTYSCOLLFKKOEESTOElGLIGtHRFYESGIV 
3701  AGCCAACATATAGTTGTCAGTTGCTTTTTAAAAAAGATGAAGA7ACTACTGATGAGAIAGGATTGATTGGTATTCATCGTTTCTACGAATCTGGAATTGT 

FEEYKOYFCISKWYLKEVrRKPTHlICLGCRUQF 

3801  atttgaagaotataaacattaittttgiataagtaaatgctacttaaaagaggtaaaaagcaaaccatataatttaaaattgggatgtaattggcagttt 

IPKOEGWTETer 

3901  attcctaaagatgaagggtggactgaataatataactatatgctcagcaaacctaitttatataagaaaagtttaagtttataaaatcttaagtttaagg 


4001  ATGTAGCTAAATTTTGAATATTAGATAAACTACATGTTT  4039 


Fig 5. Complete nucleotide sequence of the type B gene (continued)

BoNT/B  amino  acid  sequence  is  given  in  the  single  letter  code  above  the  central  nucleotide  of  the 
corresponding  codon.  The  ribosome  binding  site  is  indicated  by  a  line  above  and  below  the  sequence. 


The entire nucleotide sequence of the botB gene (Fig. 5) was obtained by splicing the
individual sequence information derived from the inserts of pCBB1, pCBB2 and pCBB3 into a
contiguous sequence. The gene is composed of 1291 codons, initiating with an AUG codon at
position 55 and terminating with a UAA stop codon at position 3928 (Fig. 5). The choice of
these particular translational start and stop codons is typical of clostridial genes (Young et al., 1989). As
with all other bot genes characterised to date, the high A+T content of the DNA (74.6%)
results in an extreme bias towards the use of codons ending in A or T, and the frequent use of
codons recognised as modulators in E. coli. The translational start codon is preceded by a
sequence typical of clostridial ribosome binding sites (Young et al., 1989).
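
The coordinates quoted above can be checked with a few lines of Python. The sketch assumes that both positions refer to the first base of the respective codon (1-based numbering as in Fig. 5); the function names are illustrative, and the A+T helper is shown on a short stretch from the start of the ORF rather than on the full sequence.

```python
def codon_count(start: int, stop: int) -> int:
    """Number of sense codons, given the 1-based position of the first base of
    the start codon and of the first base of the stop codon."""
    length = stop - start
    if length <= 0 or length % 3:
        raise ValueError("start and stop codons are not in the same frame")
    return length // 3


def at_content(seq: str) -> float:
    """Percentage of A and T residues in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(base) for base in "AT") / len(seq)


print(codon_count(55, 3928))  # 1291, in agreement with the text
print(round(at_content("atgccagttacaataaataattttaattat"), 1))
```

Applied to the complete sequence of Fig. 5, at_content would be expected to return a value close to the 74.6% reported.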

Alignment of the nucleotide sequences of the two botA-derived DNA probes used in
Southern blot mapping with the equivalent regions of botB confirmed that the greater degree
of homology existed in the respective H chain encoding regions rather than in those encoding L chain.
Specifically, the 628 bp HaeIII-HindIII botA fragment demonstrated 65% homology with botB,
whereas the 389 bp HpaI-^oW botA fragment had 54.8% homology with botB. Comparative
alignment demonstrated that, in general, the overall DNA homology between the H chain and
L chain encoding regions of all sequenced neurotoxin genes reflected the level of amino acid
sequence homology (Table 2), and averaged between 50 and 60% identity. One consequence of
this relative dissimilarity between genes is that DNA probes specific to each toxin gene may be
easily designed. However, although there is sufficient homology in certain regions to derive a
generalised probe for the generic detection of neurotoxin genes, it has not proven possible to
design a probe which hybridises to all bot genes and not to the TeTx gene (unpublished data).
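
A percentage homology of the kind quoted above can be computed from a pairwise alignment as sketched below. The report does not state which convention was used, so the choice of denominator here (aligned columns in which neither sequence has a gap) is an assumption, and the aligned strings are invented for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gapped columns are excluded under this convention
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment: 9 ungapped columns, 8 of them identical.
print(round(percent_identity("ATGCCAGT-A", "ATGACAGTTA"), 1))  # 88.9
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give somewhat different figures, so values should only be compared when calculated in the same way.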


The  complete  amino  acid  sequence  of  BoNT/B. 

The deduced primary sequence of BoNT/B demonstrates that the toxin is composed of 1291
amino acid residues. By comparison to partial amino acid sequences derived from purified
polypeptides from other C. botulinum type B strains, it is apparent that variations in toxin
structure occur. Thus although amino acid residues 2 through 17 exhibit perfect conformity to
the sequence derived by Edman degradation of purified BoNT/B L chain of strain B/Okra
(Sathyamoorthy and DasGupta, 1985), the amino acid at position 23 of the H chain was
determined (DasGupta and Datta, 1988) to be Arg rather than the Ser residue seen here
(position 464, Fig. 4). Similarly, the BoNT/B of strain B/657 possesses a Met amino acid at
position 30 of the L chain (DasGupta and Datta, 1987) compared to Thr in the case of BoNT/B
of Danish and B/Okra. Variations in the primary amino acid sequence of other types of BoNT
have been noted, e.g., between BoNT/A of strain 62A (Binz et al., 1990) and strain NCTC
2916 (Thompson et al., 1990), and between BoNT/E of strains Beluga, Mashike, Iwanai,
Otaru and NCTC 11219 (this study). In the case of BoNT/B, such variations go some way to
explaining observed dissimilarity in the immunological properties of BoNT/B isolated from
different strains (Hatheway et al., 1981; Notermans et al., 1984).

1.4  CLONING/SEQUENCING OF THE BoNT/F GENE


1.4.1  Summary 

A total of 4 overlapping regions from the C. botulinum type F genome have been amplified
by PCR and cloned into plasmid vectors. Nucleotide sequence analysis of the inserts of the
resultant plasmids (pCBF1-4) has allowed the derivation of a 3590 bp portion of the botF
structural gene, encoding 1196 amino acid residues. From comparative alignment with
BoNT/E, it is estimated that c. 20 codons are missing from the 5'-end of the gene, and 60
codons from the 3'-end of the gene. The unsequenced portion of the insert of plasmid pCBF3
is of sufficient size to comfortably encode the missing 3'-end of the gene. The 5'-end of the
gene remains to be cloned. Furthermore, the sequence is only of a preliminary nature as
approximately 50% has only been derived from analysis of DNA amplified from a single PCR,
and may therefore contain Taq polymerase-induced errors. The amino acid sequence available
demonstrates that BoNT/F is highly homologous to BoNT/E. At present the incomplete H
chain shares 68% identity with the equivalent region of the BoNT/E H chain, while the
respective L chains exhibit 50% similarity. They therefore represent the most closely related
neurotoxin pairing.


1.4.2  Results  and  Discussion 
Cloning  of  H  chain  encoding  DNA  by  PCR 

The oligonucleotide primers HE2 and HE5 (Table 1) had previously been shown to effect
the amplification of a 1.2 kb fragment in a PCR using both type B and E DNA as template.
When these two primers were employed in PCR using type F chromosomal DNA, an
identically sized fragment was generated. This fragment was blunt-ended by treatment with
T4 DNA polymerase and, following its isolation from an agarose gel, inserted into the SmaI
site of pMTL32. The entire insert, and specific subfragments, were excised from the
recombinant plasmid (pCBF1, Fig 6) and subcloned into M13mp18 and M13mp19. Templates
prepared from the various recombinant phages were then subjected to nucleotide sequence
analysis using universal primer. In certain instances the sequence obtained with a particular
template was extended using a synthesised sequence-specific oligonucleotide. Translation of
the nucleotide sequence obtained revealed the presence of a continuous ORF exhibiting
substantial homology (74.4%) to BoNT/E.

Cloning of contiguous BoNT/F encoding DNA


To facilitate the cloning of regions of botF contiguous with that present in the insert of
pCBF1, plasmid DNA was radiolabelled and used in Southern blot experiments to construct a
restriction map of the type F genome (Fig. 6). The data obtained suggested that a 2.9 kb

Fig. 6. BoNT/F gene cloning strategy. The 4 PCR-amplified regions of the strain Langeland
chromosome that were cloned in the recombinant plasmids pCBF1-4 are represented by
appropriate boxes below the restriction map of the region of the genome encoding the BoNT/F
gene (bold line = light chain, hatched box = heavy chain). An open box (pCBF1) indicates that the
amplified region was obtained in a standard PCR; the dotted boxes (pCBF2-4) represent regions
amplified by inverse PCR. The vertical dotted lines identify the boundaries of the concatenated
restriction fragments employed as the substrate for inverse PCR, using primer pairs HF1 to HF6
(see text for sequences). Primers HE2 and HE5 are those used in the cloning of the equivalent
region of botE. Abbreviated restriction sites are: RI, EcoRI; RV, EcoRV; and HIII, HindIII.


EcoRI fragment encompassed the cloned insert of pCBF1. As this fragment encoded
significant further portions of the botF gene, it was targeted for cloning by a strategy involving
inverse PCR. Type F chromosomal DNA was cleaved with EcoRI, incubated with T4 DNA
ligase and the resultant concatenated DNA used as the template in a PCR with two
oligonucleotide primers (HF1, 5'-CTCCTAATAATTCAAATGCCTCCTT-3'; HF2,
5'-AACTAGTTTTTAATTATACACAAAT-3') complementary to sequences at the proximal and
distal end of the pCBF1 insert (Fig. 6). The 1.9 kb fragment amplified was blunt-ended and
cloned into the SmaI site of pMTL32 to yield the recombinant plasmid pCBF2. The insert of
this plasmid was excised by digestion with BglII and M13 templates containing random inserts
generated using the sonication procedure. The subsequent nucleotide sequence data obtained,
in combination with that previously obtained from the pCBF1 insert, resulted in a contiguous
sequence of 2,975 bp in length. Upon translation an uninterrupted ORF was evident encoding
a polypeptide of 994 amino acid residues exhibiting 66.4% similarity to BoNT/E. From the
alignment obtained between this polypeptide sequence and BoNT/E it was evident that a DNA
region equivalent to some 150 codons was missing from the 3'-end of the botF gene, and
approximately 122 codons from the 5'-end.


Cloning  of  the  5'-  and  3'-end  of  botF 

To identify restriction fragments encoding the 5'- and 3'-ends of botF, Southern blot
experiments were undertaken using type F genomic DNA cleaved with various restriction
enzymes and two M13 recombinant clones, M13F16 and M13F44, as radiolabelled probes.
These two M13 clones contained approximately 500 bp DNA inserts derived from either the
proximal or distal ends of the sequenced 2.9 kb EcoRI fragment. Using probe M13F16 a NspHI
fragment of approximately 650 bp in size was identified with the potential to encode the
5'-end of botF, while the probe M13F44 identified a 2.0 kb HindIII fragment deemed to carry the
3'-end of the gene. To clone the appropriate coding region of each fragment, PCR was
undertaken with concatenated NspHI- and HindIII-cleaved type F chromosomal DNA with the
primers HF5 (5'-TCAGGTCCTGCTCCCAATACAAGAAG-3') + HF6
(5'-CCCCGTTAGAAAACTAATGGATTCA-3') and HF3 (5'-TTACTACTATATATTCC-3') +
HF4 (5'-GATCCAAGTATCTTAAAAGACTTTT-3'), respectively (Fig. 6). The fragments
amplified (600 bp in the case of HF5 + HF6, and 1.5 kb in the case of HF3 + HF4) were
cloned directly into pCR1000 to give the recombinant plasmids pCBF4 and pCBF3,
respectively (Fig. 6).

The entire insert of pCBF3, and a 0.8 kb EcoRI-HindIII subfragment of the pCBF4 insert,
were subcloned into M13mp18 and M13mp19 and the resultant templates sequenced using
universal primer. In the case of the pCBF4-derived templates, the sequence obtained proved to
be contiguous with that of the insert of pCBF2; however, alignment of the encoded
polypeptide with BoNT/E revealed that the extreme 5'-end of the gene had not been cloned.
Thus if BoNT/F has an identical number of amino acid residues at its aminoterminus, 20
codons are missing from the start of the gene. In the case of the pCBF3-derived templates, the
M13mp19 recombinant template extended the botF gene by 399 nucleotides.

Current status of the botF nucleotide sequence


The  extent  of  the  botF  nucleotide  sequence  currently  known  is  illustrated  in  Fig.  7.  A  total 
of  3590  bp  in  length,  it  encodes  1196  amino  acid  residues.  By  comparison  to  BoNT/E,  the 


NOIPrEEKSICKYTKAFElMRNVUIIPEII<ITtGT 
ATCCACaOACCATATGAAGAAAAAAGTAAAAAATATTATAAAGCTTTTGAGATTATCCCTAATGTTTGGATAATTCCTCAGAGAAATACAATAGGAACGG  100 
NspHI  HindIII

OPSDFDPPASLENGSSAYYOPNYLTTDAEKDRYL 
ATCCTAOTGATTTTGATCCACCGGCTTCATTAGAGAACGGAAGCAGTGCTTATTATGATCCTAATTATTTAACCACTGATCCTGAAAAAGATACATAITT  200 

KTTIKIFKRIHSHPAGEVLLOEISYAKPYLGHE 
AAAAACAACGATAAAATTATTTAAGAGAATTAATAGTAATCCTGCAGGGGAAGITTTCTTACAACAAATATUTATGCTAAACCATATTTAGGAAATGAA  300 

HYPIMEFHPVTRTTSVHirSSTHVrSSllLMLL 
CACACGCCAATTAATGAATTCCATCCAGTTACTAGAACTACAAGTGTTAATATAAAATCAICAACTAATGTTAAAAGTTCAATAATATTGAATCTTCTTG  400 
EcoRI

VLGAGPOIFENSSYPVRKLNOSCGVYDPSNDCFG 
TATTGGGAGCAGGACCTGATATATTTGAAAATTCTTCTT.'CCCCGTTAGAAAACTAATGGATTCACGTGGAGTTTATGACCCAACTAATGATGGTTTTGG  500 

SIHIVTFSPEYEYTFHOISGGYHSSTESFIADP 
ATCAATTAATATCGTGACATTTTCACCTGAATATGAATATACTTTTAATGATATTAGTGGAGGGTATAACAGTAGTACACAATCAnTATTGCAGATCCT  600 

AIAIAMELIHALHGLYCARGVTYKETIICVICQAP 
GCAATTTCAtrAGCTCATGAATTGATACATGCACTGCATGGATTATACCGGGCTAGGGGAGTTACTTATAAAGAGACTATAAAAGTAAAGCAAGCACCTC  700 
NspHI

LMlAEKPIRLEEFLTFCGQDLNItTSANKEICIYN 
TTATGATAGCCGAAAAACCCATAAGGCTAGAAG,'.\T''TTTAACCTTTGGAGGTCAGGATTTAAATATTATTACTAGTGCTATGAAGGAAAAAATATATAA  800 

NLLANYEKIATRLSRVHSAPPEYOIHEYEDYFQ 
CAATCTTTTAGCTAACTATCAAAAAATAGCTACTAGACTTAGTAGAGTTAAIAGTGCTCCTCCTGAATATGATATTAATCAATATAAAGATTATTTTCAA  900 

UKYGt.0ICNADGSrTVHE!1KFNEIYKKlYSFTEI 
TGGAAGTATGCGCTAGATAAAAATGCTGATGGAAGTTATACTGTAAATGAAAATAAATTTAATGAAATTTATAAAAAATTArATACCTTTACAGAGATTG  1000 

OLANKFICVKCRNTYFIKYGFIKVPNILOBDIYTV 
ACTTAGCAAATAAATTTAAAGTAAAATGTAGAAATACTTATTTTATTAAATATGGATTTTTAAAAGTTCCAAATTTGTTAGAIGATCATATTTATACTGT  1100 

SEGFNtGNLAVNHRGQMtKlNPKtlOSiPOICGL 
ATCAGAGGCGTTTAATATAGGTAATTTACCAGTAAACAATCGCCGACAAAATATAAAGTTAAATCCTAAAATTATTGAITCCATTCCAGATAAACGTCTA  1200 

VEKlVKFCKSVIPRICGTrAPPRLCIRVHHRELF 
GTCCAAAAGATCCTTAAATTTTGTAAGAGCGTTATTCCTAGAAAAGGTACAAAGGCCCCACCGCGACTATCCATTAGACTAAATAATAGCGAGTTATTTT  1300 

FVAJESSTNENOINTPKEIDOTTNLHNNYRMNLO 
TTGTAGCTTCAGAAAGTAGCTATAATGAAAATGATArTAATACACCTAAAGAAATTGACGATACAACAAATCTAAATAATAATTATAGAAATAATTTAGA  1400 

EVILOYNSETIPQISNOTLNTLVOODSYVPRYO 
TGAAGTTATITTAGATTATAATACTGAGAi_''ATACCTCAAATArCAAATCAAACATTAAATACACTTGTACAAGACGATAGTTATGTGCCAAGATATGAT  1500 

SNGTSEIEEHNVVOLNVFFYLNAQAVPEGETNI 
TCTAATCGAACAAGTGAAATAGAGGAACATAATGTTGTTGACCTTAATGTArTTTTCTAITTACATCCACAAAAAGTACCAGAAGGTGAAACTAATATAA  1600 

SLTSSIOTALSEESOVYTFFSSEFIHTIRKPVHA 

CTTTAACTTCTTCAATTGATACGGCATTATCAGAAGAATCGCAAGTATATACATTCTTTTCTTCACAGTTTATTAATACTATCAATAAACCTGTACACGC  1700 

ALFISUINQVIROFTTEATOKSTFDKIABISLV 
AGCACTATTTATAAGTTGGATAAATCAAGTAATAAGAGATTTTACTACTCAAGCTACACAAAAAAGTACTTTTGATAAGATTGCAGACATATCTTTAGTT  1800 

ScaI

VPYVGLALNIGNEVOKENFKEAFELLOAGtLLE 
GTACCATATGTAGGTCTTGCTTTAAATATAGGTAATGAGGTACAAAAAGAAAATTTTAAGGAGGCATTTGAATTATTAGCAGCCCGTATTTTATTAGAAT  1900 

FVPEtLIPflLVFTIKSFIGSSEHKHKiriCAlNH 
TTGTGCCAGACCTTTTAATTCCTACAATTTTAGTGTTTACAATAAAATCCTTTAIAGGTTCATCTCAGAATAAAAATAAAATCAITAAAGCAATAAATAA  2000 

SINERETKWXF.  ITSUIVSNWLTRIHTOFNKRKE 
TTCATTAATGGAAAGAGAAACAAAGTGGAAACAAATATATAGTTGGATAGTATCAAATICGCTTACTAGAATTAATACACAArTTMTAAAAGAAAAGAA  2100 


Fig 7. Partial nucleotide sequence of the type F gene. The illustrated sequence was derived
by amalgamating the nucleotide sequences of the inserts of plasmids pCBF1 to pCBF4 (Fig. 6). The
BoNT/F amino acid sequence is given in the single letter code above the central nucleotide of the
QHYQALUMOVDAIKTVIETKYNNTTSDERMRLE
CAAATGTATCAAGCTTTGCAAAATCAAGTAGATGCAATAAAAACAGTAATAGAATATAAATATAATAATTATACTTCAGATGAGAGAAATAGACTTGAAT  2200 
HindIII

SEYMINNlREELHKICVSLAMENtERPITESSlEY 
CTGAATATAATATCAATAATATAAGAGAAGAATTGAACAAAAAAGTTTCTTTAGCAATGGAAAATATAGAGAGATTTATAACAGAGAGTTCTATATTTTA  2300 

LNKLINEAKVSKLREYOEGVKEYLLDYISEHRS 
TTTAATGAAGTTAATAAATGAAGCCAAAGTTAGTAAATTAAGAGAATATGATCAAGGCGTTAAGGAATATTTGCTACACTATATTTCAGAArATAGATCA  2400 

IL6HSVOELNOLVTSTLMMSIPFELSSYTMDKI 
ATTTTAGGAAATAGTGTACAAGAATTAAATGATTTAGTGACTAGTACTCTGAATAATAGTATTCCATTTGAACTTTCTTCATATACTAATGATAAAATTC  2500 

ScaI

LILYFNKLYKKIKDNSILOHRYENNKFIOISGYG 
TAATTnATATTTTAATAAATTATATAAAAAAATTAAAGATAACTCTATTTTAGAIATGCGATTGAAAATAATAAATTTATAGATATCTCTGCATATGG  2600 

EcoRV 

SNISINGOVYIYSTRRNQFGIYSSKPSEVNIAQ 
TTCAAATAT'AGCATTAATGGAGATGTATATATTTATTCAACAAATAGAAATCAATTTCGAATATATAGTAGTAAGCCTAGTGAAGTTAATATAGCTCAA  2700 

HMOIIYNGRYQMFSISFWVRIPKYFNKVNLMME 

AATAATGATATTATATACAATGGTAGATATCAAAATTTTAGTATTAGTTTC7GGCTAAGGATTCCTAAATACTTCAATAAAGTGAATCTTAATAATGAAT  2800 
EcoRV 

YTIIOCIRMNNSGUKISLMYNKItUTLODTAGMM 
ATACTATAATAGATTGTATAAGGAATAATAATTCAGGATGGAAAATATCACTTAATTATAATAAAATAATTTGGACTTTACAAGATACTGCTGGAAATAA  2900 

OKLVFNYTOHISISOYIHKUIFVTITNNRLGMS 
TCAAAAACrAGTTTTTAATTATACACAAATGATTAGTATATCTGATTATATAAATAAATGGATTTTTGTAACTATTACTAATAATAGATTAGGCAATTCT  3000 

RlYlMGNLIOEKStSNLGDIHVSONtLFKIVGC 
AGAATTTACA'CAATGGAAATTTAATAGATGAAAAATCAATTTCGAATTTAGGTGATATTCATGITAGTGATAATATATTATTTAAAATTGTTGGTTGTA  3100 

MOTRYVGIRYFKVFDTElGICTEtETLYSOEPOPS 
ATGATACAAGATATGTTGGTATAACATATTTTAAAGTTTTTGATACGGAATTAGGTAAAACAGAAATTGAGACTTTATATAGTGATGAGCCAGATCCAAG  3200 

ILrOFUGRYLLYXrRYYLLMLLRTDKSITOMSR 
TATCTTAAAAGACTTTTGGGGAAATTATTTGTTATATAATAAAAGATATTATTTATTGAATTTACTAAGAACAGATAAGTCTATTACTCAGAATTCAAAC  3300 

EcoRI

FIRIROQRGVYOKPMIFSRTRLYTGVEVIIRKR 
TTTCTAAATATTAATCAACAAAGAGGTGTTTATCAGAAACCAAATATTTTTTCCAACACTAGATTATATACAGGAGTAGAAGTTATTATAAGAAAAAATG  3400 

GSTOISRTDNFVRKROLAYIIIVVORDVOYRIYAD 
GATCTACAGATATATCTAATACAGATAATTTTGTTAGAAAAAATCATCTGGCATATATTAATGTAGTAGATCGTGATGTAGAATATCGGCTATATGCTGA  3500 

tSIAKPEKI  IKLtRTSNSNMSLGQt  tVMDS 
TATATCAATTGCAAAACCAGAGAAAATAATAAAATTAATAAGAACATCTAATTCAAACAATAGCTTAGGTCAAATTATAGTTATGGATTC  3590 


Fig 7. Partial nucleotide sequence of the type F gene (continued)

corresponding codon. It should be noted that the sequence is of a preliminary nature, having been
derived in certain regions from a single PCR-amplified DNA fragment.


sequence currently lacks the coding potential for c. 20 amino acids at the aminoterminus, and
c. 60 amino acids at the carboxyterminus. In addition, approximately half the sequence has
only been derived from a single PCR-derived cloned fragment. Two further isolates of
plasmid clones pCBF1 to pCBF4 have now been derived from two further independently
PCR-amplified fragments, and the inserts of a single representative of each are being sequenced. To
date only 3 discrepancies with the sequences obtained from the original clones have been
detected. Thus the error rate seems considerably lower than that observed during the
amplification of type E DNA. The reason for this is unclear, the PCR conditions being
identical, but it may lie in the use of different sources of Taq polymerase. Thus, during
the period that type E DNA was being amplified, enzyme from Palliard Chemical Co was in
widespread use within this laboratory. More recently, the supplier of Taq polymerase has
consistently been Cetus.


The unsequenced portion of the insert of plasmid pCBF3 is of sufficient size to easily
accommodate the missing 3'-end of the gene. Further sequencing of this clone should therefore
allow the derivation of the carboxyterminus of BoNT/F. The DNA region encoding the amino
terminus still remains to be cloned. In view of the high degree of DNA homology between
botE and botF, it is planned to amplify an appropriate fragment in a standard PCR
using a "sense" primer based on the 5' non-coding region of botE, and an "anti-sense" primer
based on the sequence derived from pCBF4.


1.5  CLONING/SEQUENCING OF THE BoNT/G GENE
1.5.1  Summary 

A 1050 bp fragment was PCR amplified from type G chromosome, using primers designed
for the detection of the type A gene, cloned into pCR1000 and 250 bp of sequence information
determined from either end. This sequence proved to be indistinguishable from botA,
indicating that the chromosome had been prepared from a type G culture contaminated with
type A cells.


Attempted  cloning  of  a  H  chain  encoding  region  of  the  BoNT/G  gene 

During the course of a parallel programme of work, in which oligonucleotide primers for
the detection of toxin genes were being evaluated, it was noted that primers based on botA and
botB sequence consistently amplified specific DNA fragments from type G chromosomal
DNA. In one particular case the intensity and size of the fragment generated were equivalent to
those seen with the intended target DNA, that of type A chromosome. This 1050 bp fragment
was therefore cloned directly into pCR1000 and the proximal and distal regions of the insert of
the resultant recombinant plasmid analysed in a plasmid sequencing reaction using universal and
reverse primer, respectively. A total of some 250 bp of sequence information was obtained
with each primer; however, the two sequences proved to be identical to botA. It was
concluded that the culture from which the chromosomal DNA had been prepared was
contaminated with C. botulinum type A.


1.6  AMINO ACID HOMOLOGIES BETWEEN CHARACTERISED NEUROTOXINS

Pairwise comparisons of the respective L and H chain components of all 7 toxins were
undertaken and the results summarised in Table 2 (it should be noted that the amino acid
sequences of the L and H chain of BoNT/F are incomplete). From this it can be seen that,
with notable exceptions, the overall level of identity between L chains varies from around 30
to 35%. The four exceptions are the degree of homology seen between BoNT/E and BoNT/F
(56%), BoNT/E and TeTx (40%), BoNT/C and BoNT/D (47%) and BoNT/B and TeTx
(50%). The fact that certain BoNTs (BoNT/B, BoNT/E and BoNT/F) exhibit a greater
degree of homology to the TeTx L chain than to other BoNT L chains is particularly striking.
These homologies serve to emphasise the close relationship that exists between the
pharmacological action of BoNT and TeTx. With the exception of TeTx, the H chains exhibit
a much broader spread of % similarity values than the L chains. The highest degree of
similarity is that found between BoNT/E and BoNT/F (68%), closely followed by the 56%
similarity between the H chains of BoNT/C and BoNT/D. The overall dissimilarity of the
TeTx H chain to BoNTs is consistent with the view that this region is responsible for the
essential difference between these neurotoxins, viz, their site of action.

Table 2. Amino acid homology between the L and H chain components of the different types of
BoNT and TeTx. Figures represent the % identity between di-chain components. The upper
quadrant contains H chain comparisons, the lower L chain homologies. A, B, C, D, E, and F
refer to the respective BoNT; TET represents TeTx. The data for BoNT/F are incomplete.
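The % identity figures of the kind reported in Table 2 can be illustrated with a minimal
Python sketch. This is not the procedure used in this report: the function names, the toy
sequences and the choice of denominator (only positions where neither sequence carries a
gap) are assumptions made for illustration.

```python
from itertools import combinations


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return % identity between two sequences taken from a gapped alignment.

    Assumption: columns in which either sequence has a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def identity_table(aligned: dict) -> dict:
    """Pairwise % identity for every pair of named, aligned sequences."""
    return {
        (n1, n2): percent_identity(aligned[n1], aligned[n2])
        for n1, n2 in combinations(sorted(aligned), 2)
    }


if __name__ == "__main__":
    # Hypothetical toy fragments, not real neurotoxin sequence.
    toy = {"X": "MKV-LT", "Y": "MRVQLT", "Z": "MKVQIT"}
    for pair, value in identity_table(toy).items():
        print(pair, round(value, 1))
```

Applying the same function separately to the L chain and H chain portions of an alignment
would give the two halves of a table laid out like Table 2.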


Purely on the basis of H chain comparisons, the BoNTs may be conveniently split into 3
pairings, viz, BoNT/A and BoNT/B, BoNT/C and BoNT/D, and BoNT/E and BoNT/F. The
latter two pairings also appear to hold for the L chains; however, the BoNT/A and BoNT/B L
chains represent the most dissimilar pairing. These relationships are best illustrated by the
phylogenic tree shown in Fig. 8. The variance seen in the relative order of relatedness
between toxins, depending on which component of the dichain is compared, is intriguing. It
suggests that either the L and H chain domains of an individual neurotoxin have evolved at
disproportionate rates, or that at various stages during evolution hybrid toxins have arisen by
fusion of distinct H and L chain encoding regions.


Fig. 8. Phylogenic relationships between the H and L chains of clostridial neurotoxins
(panels: L chains; H chains). The distance of the line along the x axis is indicative of degree
of divergence.

An alignment of the entire amino acid sequences of BoNT/A, B, C, D, E and TeTx, and
the partial amino acid sequence of BoNT/F is presented in Fig. 9. Regions of sequence
similarity have been boxed. This demonstrates that the neurotoxins are composed of highly
conserved amino acid domains interspersed with amino acid tracts exhibiting little overall
similarity. Disregarding the incomplete BoNT/F sequence, within the L chain region (average
size 442), 68 amino acids are totally conserved. Eleven of these conserved amino acids reside
in a region (position 216 to 234 of BoNT/B) which encompasses a histidine-rich motif. The three
conserved His residues of this region have previously, on the basis of their conservation in
BoNT/A, BoNT/E and TeTx, been suggested to play some role in the presumed catalytic
activity of the L chain (Binz et al., 1990). Their conservation in all 7 neurotoxins does not
detract from this hypothesis. Preliminary work, however, in which site-directed mutagenesis
has been used to effect amino acid substitutions at all three His positions did not affect the
toxicity of a BoNT/A subunit in an Aplysia californica buccal ganglion model system (Binz et
al., 1991).

Fig. 9. Full alignment of all known clostridial neurotoxin sequences. The illustrated alignment
was essentially derived using the computer programme CLUSTAL (Higgins and Sharp, 1988),
and has been gapped to maximise homology. Dashes correspond to regions of the BoNT/F
sequence which have yet to be determined. Highly conserved regions have been boxed, and
include areas in which conservative replacements have occurred, in addition to sequence
identity. Amino acids conserved in at least 6 out of 7 toxins have been emboldened.
Numbering above the alignment corresponds to BoNT/A. The Cys amino acids presumed to be
involved in the formation of the disulphide bridge between neurotoxin L and H chains are
marked by upward facing arrows.

Ignoring the incomplete BoNT/F sequence, within the H chain region (average size 845
amino acids) 110 amino acids are absolutely conserved. Most notable is the high degree of
conservation of Trp amino acids. Thus, for instance, of the 12 Trp residues which occur in the
BoNT/E H chain, 9 are absolutely conserved in all toxins, while the remaining 3 are conserved
in all but one of the neurotoxins at each position. The only Trp that occurs in the BoNT/E L
chain is conserved in all neurotoxins. The functional significance of the apparent evolutionary
pressure for maintaining this amino acid, or chemically similar residues, at these positions in
BoNT and TeTx remains unknown. However, previous studies in which BoNT Trp residues
have been selectively modified by chemical means have established a crucial role in both
toxicity and immunogenicity (see DasGupta, 1990). Indeed, in one study the inactivation of a
single Trp resulted in near complete detoxification (Shibaeva et al., 1981, cited in DasGupta,
1990). The selective disruption of conserved Trp amino acids in BoNT by site-directed
mutagenesis should help identify which residue(s) are important in toxicity and antigenicity.
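Counting absolutely conserved positions, or conserved positions for one residue such as
Trp, can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes
the aligned sequences are available as equal-length strings with "-" as the gap character, and
the function name and toy sequences are hypothetical.

```python
def conserved_columns(aligned_seqs, residue=None, gap="-"):
    """Return 0-based alignment columns where every sequence has the same residue.

    If `residue` is given (e.g. "W" for Trp), only columns conserved for that
    residue are reported. Columns containing a gap are never counted.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have equal length")
    positions = []
    for i in range(length):
        column = {seq[i] for seq in aligned_seqs}
        if len(column) != 1 or gap in column:
            continue  # variable or gapped column
        conserved = next(iter(column))
        if residue is None or conserved == residue:
            positions.append(i)
    return positions


if __name__ == "__main__":
    # Hypothetical toy alignment, not real neurotoxin sequence.
    toy = ["MWK-W", "MWR-W", "MWKAW"]
    print(conserved_columns(toy))               # all conserved columns
    print(conserved_columns(toy, residue="W"))  # conserved Trp columns only
```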


Fig. 10. Hydrophobicity plots of all currently characterised clostridial neurotoxins.
Hydrophobicity was calculated using the computer program of Kyte and Doolittle (1982) with a
window size of 9 amino acids. The average value for each toxin was: BoNT/A, -0.37;
BoNT/B, -0.42; BoNT/C, -0.41; BoNT/D, -0.36; BoNT/E, -0.45; and TeTx, -0.37. The
conserved hydrophobic region is indicated below each profile by a barred line. The respective
residues involved are 652 through 687 (BoNT/A), 642 through 671 (BoNT/B), 648 through 678
(BoNT/C), 646 through 674 (BoNT/D), 624 through 654 (BoNT/E) and 660 through 691
(TeTx).

The most notable tract of sequence divergence between the toxins resides, with the
exception of the extreme 10 or so amino acids, in the COOH-termini of the toxins (position
1117 onwards of BoNT/A). Divergence in this latter area would appear consistent with the
notion that this domain is involved in BoNT binding, and that the different toxins target
different acceptors on the cell surface. The presence of the conserved motif
WXFI/VXXXXGW at the extreme COOH-terminus of all neurotoxins (except BoNT/C, where
the terminal GW is missing, and BoNT/F which has yet to be sequenced in this region) is
especially noteworthy, considering the degree of diversity of the preceding 100 amino acids.
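The motif WXFI/VXXXXGW (W, any residue, F, I or V, any four residues, G, W) translates
directly into a regular expression. The sketch below is illustrative only; the function name and
the test sequence are hypothetical, and positions are reported 1-based to match the residue
numbering convention used in the text.

```python
import re

# W, any residue, F, I or V, any four residues, G, W
CTERM_MOTIF = re.compile(r"W.F[IV].{4}GW")


def find_cterm_motif(sequence: str):
    """Return (1-based start position, matched text) for each motif occurrence."""
    return [(m.start() + 1, m.group()) for m in CTERM_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical peptide, not real neurotoxin sequence.
    print(find_cterm_motif("AAAWKFIQRSTGWEE"))
```

A truncated variant of the kind described for BoNT/C (terminal GW absent) would not match this
pattern and would need a separate, shorter expression.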

The algorithms of Chou and Fasman (1978) and Garnier et al. (1978) were employed to
derive predictive representations of BoNT and TeTx secondary structure (data not shown).
The results obtained went some way towards confirming the observations of a comparative
structural analysis undertaken with purified BoNT/A and BoNT/E (Singh et al., 1990). Thus,
BoNT/E is predicted to contain a lower α-helix content than BoNT/A (BoNT/E, 20%;
BoNT/A, 27%), and a correspondingly higher content of β-sheet (BoNT/E, 52%; BoNT/A,
46%). No common pattern between the predicted structures of each neurotoxin was,
however, apparent. In contrast, a hydrophilicity analysis by the method of Kyte and Doolittle
(1982) demonstrated a high degree of conservation between all 7 neurotoxins in their
arrangement of polar and nonpolar amino acids (see Fig. 10). A similar previous analysis of
TeTx (Eisel et al., 1986) and BoNT/A (Thompson et al., 1990) had concluded that the H chain
of these particular toxins contained a common domain (TeTx, 660 through 691; BoNT/A, 652
through 687; Fig. 9) of sufficient length and hydrophobicity to possess membrane spanning
potential. The equivalent hydrophobic domains (Fig. 9) are also conserved in BoNT/B (642
through 671), BoNT/E (624 through 654), BoNT/F (643 through 673), BoNT/C (648 through
678) and BoNT/D (646 through 674).
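A sliding-window hydropathy profile of the kind plotted in Fig. 10 can be sketched as follows.
The scale values are the published Kyte and Doolittle (1982) hydropathy indices and the default
window of 9 follows the Fig. 10 legend; the function names are illustrative, and this is not the
program that was actually used for the analysis.

```python
# Kyte and Doolittle (1982) hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9):
    """Mean hydropathy of each full window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def mean_hydropathy(sequence: str) -> float:
    """Average hydropathy over the whole sequence (cf. the per-toxin averages in Fig. 10)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence.upper()) / len(sequence)


if __name__ == "__main__":
    # Hypothetical peptide: a hydrophobic stretch flanked by charged residues.
    peptide = "KDERKDILVIALLVIAGLLKDERK"
    for start, value in enumerate(hydropathy_profile(peptide), start=1):
        print(start, round(value, 2))
```

A run of consecutive windows with strongly positive values, roughly 20 or more residues long,
is the kind of feature interpreted in the text as a potential membrane-spanning domain.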


2.  EXPRESSION  SYSTEM  DEVELOPMENT 


2.1  MATERIALS  AND  METHODS 
Bacterial  strains,  plasmids  and  culture  conditions. 

The E. coli and Bacillus subtilis strains routinely used as the host for recombinant
experiments were TG1 (Δ[lac-pro] supE thi hsdD5 F' traD36 proA+B+ lacIq lacZΔM15) and
168 trpC, respectively. The Clostridium acetobutylicum strain employed was NCIB 8052.
The strains of Clostridium sporogenes tested were: BM1091, BM1706, BM1758, BM1759,
BM1761, BM1763, BM1764, BM1765, BM1767, BM1768, BM1769, BM1774, BM1775,
BM1776, BM1780, BM1781, BM1783, BM1784, BM2130 and BM2131. All strains were
obtained from Dr. M Hudson, Pathology Division, PHLS CAMR. Recombinant plasmids
employed were the pMTL20 series of cloning vectors (Chambers et al., 1988), the replicon
cloning vectors pMTL20/21E and pMTL20/21C (Swinfield et al., 1990), pAMβ1-derived
shuttle vectors pMTL500E/C and pCTC1 (Oultram et al., 1988a; Swinfield et al., 1990;
Williams et al., 1989a & b), and the Clostridium shuttle vectors pCB3 and pCTC501 (Young
et al., 1989).

All clostridial cultures were routinely grown in 2x YTG medium (1.6% tryptone, 1.0%
yeast extract, 0.5% NaCl, and 0.5% glucose). In certain instances commercially obtained
(Oxoid) reinforced clostridial medium (RCM) was employed, and on other occasions the basal
medium of O'Brien and Morris (1971). All manipulations were undertaken under anaerobic
conditions using an Anaerobic Work Station Mark III (Don Whitley Scientific, UK). The
incubation temperature was routinely 37°C.


Plasmid  isolation  methodology. 

Plasmid DNA was isolated from E. coli as described in 1.2 of this report. Plasmid
DNA from clostridial strains was isolated by an alkaline lysis procedure. Cells from a 10 ml
volume of culture, grown overnight in 2x YTG, were harvested by centrifugation and
resuspended in 100 µl of 50 mM Tris-HCl, 25% (w/v) sucrose, 5 mM EDTA, pH 7.0, and
lysozyme added to 10 mg/ml. Following an incubation period of 60 min at 37°C, a 200 µl
aliquot of freshly prepared 0.2 N NaOH, 1% SDS, was added and the tube inverted before
being placed on ice for 5 min. A 150 µl aliquot of ice-cold potassium acetate solution (5 M
potassium acetate: glacial acetic acid: dH2O, 60:11.5:28.5) was added, mixed by vortexing
and the tube stored on ice for 5 min. Following centrifugation for 10 min in a microfuge, the
supernatant was transferred to a fresh 1.5 ml eppendorf tube and an equal volume of
phenol/chloroform (1:1) added. After vortexing and centrifugation in a microfuge the upper
aqueous layer was carefully removed, mixed with 2 volumes of ethanol and allowed to stand at
room temperature for two min. The DNA was precipitated by centrifugation, dried and
resuspended in an appropriate volume of TE buffer.


Electroporation

A loopful of fresh culture was used to inoculate 500 µl of 2x YTG and this was then used
to set up a serial dilution series in 5 ml volumes. Cultures were grown
overnight. The two lowest dilutions which had grown were used as inoculum for 100 ml 2x
YTG which was grown to an OD at 600 nm of 0.5 - 0.6 (mid-exponential growth), cooled on
ice for a few minutes, then harvested by centrifugation at 5000 rpm for 10 min. The cell pellet
was washed in 5 ml ice-cold electroporation buffer (270 mM sucrose, 1 mM MgCl2, 7 mM
sodium phosphate buffer, pH 7.4) and harvested by centrifugation as above. The pellet was
finally resuspended in 5 ml ice-cold electroporation buffer and held on ice. One µg DNA was
added to each cuvette (0.2 cm inter-electrode gap) followed by 300 µl cell culture. The
cuvettes were sealed with a plastic insert. A single pulse was delivered: 1.25 kV, 100 ohms,
25 µFD (time constant approx. 1.7 ms). The culture was removed from the cuvette by washing
with 1 ml 2x YTG and added to a final volume of 3 ml of 2x YTG (i.e. a 1 in 10 dilution). A
three hour expression period was followed by harvesting by centrifugation as above. The pellet
was resuspended in 200 µl of 2x YTG and 100 µl volumes plated on selective agar, containing
freshly prepared catalase (final concentration of 400 units/ml).
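The pulse settings quoted above can be related to field strength and time constant with a short
calculation. The sketch below rests on an assumption not stated in this report: that the
selected resistance is in parallel with the sample, so that the observed time constant is the
capacitance multiplied by the combined (lower) resistance. The function names are illustrative.

```python
def field_strength_kv_per_cm(voltage_kv: float, gap_cm: float) -> float:
    """Nominal field strength across the cuvette."""
    return voltage_kv / gap_cm


def time_constant_ms(resistance_ohm: float, capacitance_uf: float) -> float:
    """RC time constant; ohm x microfarad gives microseconds, converted to ms."""
    return resistance_ohm * capacitance_uf * 1e-3


def sample_resistance_ohm(parallel_resistor_ohm: float,
                          capacitance_uf: float,
                          observed_tau_ms: float) -> float:
    """Sample resistance implied by an observed time constant.

    Assumes the selected resistor and the sample are in parallel.
    """
    effective = observed_tau_ms / (capacitance_uf * 1e-3)
    if effective >= parallel_resistor_ohm:
        raise ValueError("observed time constant exceeds the resistor-only value")
    return 1.0 / (1.0 / effective - 1.0 / parallel_resistor_ohm)


if __name__ == "__main__":
    print(field_strength_kv_per_cm(1.25, 0.2))    # nominal kV/cm for a 0.2 cm cuvette
    print(time_constant_ms(100, 25))              # resistor-only time constant in ms
    print(sample_resistance_ohm(100, 25, 1.7))    # implied sample resistance in ohms
```

Under this assumption, a 100 ohm, 25 µFD setting alone would give 2.5 ms, so the observed value of
approximately 1.7 ms implies a sample resistance of roughly 200 ohms in the sucrose-based buffer.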

As far as possible, all manipulations were carried out in an anaerobic cabinet and all media
and buffers were allowed to equilibrate in anaerobic conditions overnight. The BioRad "Gene
Pulser" was used routinely as the electroporation apparatus.


Conjugation 

E. coli cultures were grown overnight to an OD at 600 nm of >4.0 and C. acetobutylicum
cultures (mid-exponential phase) to an OD at 600 nm of 0.6. The donor and recipient cultures
were mixed in a 1000:1 ratio within a total volume of 2 ml, passed through a sterile 0.45 µm
pore size filter (2.5 cm in diameter) and the filter was incubated upright overnight on reinforced
clostridial medium supplemented with 2 mg of catalase. Growth on the filter was harvested by
vortexing in 500 µl 25 mM potassium phosphate pH 7.0, 1 mM MgSO4 and 100 µl volumes
were plated on clostridial basal medium supplemented with 10 ng trimethoprim (to select
against E. coli) and selective antibiotic. As far as possible all manipulations should be carried
out under anaerobic conditions.


2.2  GENE  TRANSFER  IN  CLOSTRIDIUM  SPOROGENES 
2.2.1  Summary 

A total of 20 different strains of C. sporogenes have been tested as potential recipients for
DNA transfer. Having established that the BioRad Gene Pulser gives the highest rate of
electrotransformation in C. acetobutylicum (compared to equivalent equipment supplied by
Jouan, BTX and Flowgen), attempts were made to transform all strains with a variety of
plasmids and differing electrical parameters. Pulses were undertaken at a constant voltage
(1.25 kV) and capacitance (25 µFD) but at variable resistance (100, 200 & 400 ohms). Under
these conditions the % survival varied from 46 to 13%. Plasmid replicons employed were
either from the Gram-positive, broad-host-range plasmid pAMβ1, or the C. butyricum plasmid
pCB101. Selective markers were the erm (Em^R) gene of pAMβ1 or a C. perfringens tetP
gene. No transformants were obtained. Attempts to conjugatively mobilise derivatives of
these vectors, endowed with the RK2 origin of transfer (oriT), from E. coli to each strain were
similarly unsuccessful.


2.2.2  Results  and  Discussion 
Antibiotic  resistance  profiles  of  strains 

The successful introduction of extrachromosomal DNA into bacteria requires that the
transformed cell acquires a detectable phenotypic trait. The selectable genetic markers most
commonly used are genes specifying resistance to antibiotics. Before attempting to obtain
transfer of plasmids into any particular strain of C. sporogenes, it was therefore important to
establish the antibiotic resistance profiles of the strains to be employed. A 3 ml volume of
molten H-top agar was inoculated with 0.1 ml of exponential phase cells (growing in 2x YTG
media) and overlayed onto a 2x YTG agar plate. When the inoculated agar had solidified,
antibiotic-impregnated filter discs (Mast Laboratories Ltd) were placed on the agar surface and
the plates incubated overnight at 37°C. The qualitative estimates of zones of inhibition around
the different types of disc are indicated in Table 3. These show that, with the notable exception
of streptomycin (Sm) and novobiocin (Nc), the 20 strains tested exhibited varying degrees of
sensitivity to all the antibiotics tested. Of particular importance was the demonstrable
susceptibility of every strain to erythromycin (Em), chloramphenicol (Cm) and tetracycline
(Tc). Genes specifying resistance to these three antibiotics form the basis of all currently
available clostridial vectors (Young et al., 1989; Rood and Cole, 1991).


Plasmid  screening 

In parallel to the above tests each strain was screened for the presence of indigenous
extrachromosomal elements using a plasmid isolation procedure routinely used in this
laboratory for analysing transformants of C. acetobutylicum (MATERIALS AND
METHODS). The cell lysates obtained were electrophoresed on 1.4% (w/v) agarose gels in
addition to the standard 0.8% (w/v) gels normally employed in plasmid analysis. This higher
concentration of agarose ensures that any circular DNA species, "masked" by chromosomal
DNA on a 0.8% gel, migrates substantially slower than chromosome and is therefore easily
visualised (Minton et al., 1983a). No evidence for the presence of plasmid DNA was found in
the lysates of any of the 20 strains. In a further series of experiments the methods of Roberts
et al. (1986) and Weickert et al. (1986) were employed. These procedures have previously
been used to detect plasmid DNA in Clostridium perfringens and Clostridium absonum, and in
Clostridium botulinum Type A strains, respectively. Although both methods proved applicable
to a control C. acetobutylicum NCIB 8052 culture containing pCB3, no plasmids were detected
in the lysates of any of the C. sporogenes strains under investigation.

Table 3. Susceptibility of C. sporogenes strains to various antibiotics

- = no zone of inhibition; + = zone up to 10 mm in diameter; ++ = zone of 11-20 mm; +++ = zone >20
mm (all zones include the disc diameter of 6.5 mm). Antibiotic abbreviations are Cm, chloramphenicol; Em,
erythromycin; Fu, fusidic acid; Me, methicillin; Nc, novobiocin; Pc, penicillin; Sm, streptomycin; Tc,
tetracycline; Cf, cefoxitin; Mn, metronidazole; and Cl, clindamycin. Antibiotic concentrations are given in
subscripts, following each abbreviation, in µg per ml.


Evaluation  of  various  electroporators 

Since the development of our original procedure for effecting the introduction of plasmid
DNA into C. acetobutylicum using a BioRad Gene Pulser (Oultram et al., 1988a), a number of
other manufacturers have brought alternative machines onto the market. It was therefore
considered timely to undertake a comparative evaluation of more recent apparatus, on the
assumption that an increase in transformation frequencies may accrue. Three such machines
were tested, alongside the BioRad Gene Pulser, for their efficiency in transforming C.
acetobutylicum NCIB 8052 with plasmid pMTL500E (see Fig. 15). The BTX electroporator
may be essentially viewed as equivalent in specification to the BioRad apparatus. The Jouan
electropulser differs from other commercially available apparatus in that it generates a square
wave pulse, which theoretically provides a constant field during discharge. The Flowgen
Cellject resembles the BioRad and BTX machines, in that it discharges an exponential wave, but
differs in the facility for discharging a preprogrammed second pulse, immediately after the
first.


Electroporator           Transformation Frequency (per µg DNA)

BioRad Gene Pulser       1.2 x 10^
BTX Electroporator       0.8 x 10^
Flowgen Cellject         0.5 x 10^
Jouan Electropulser      0

Table 4. C. acetobutylicum transformation frequencies employing different electroporation apparatus.


Each machine was tested over a range of pulse parameters. With those machines that did
mediate transformation, however, these parameters were essentially equivalent to those (1.25
kV, 100 ohms, 25 µFD) which gave the highest levels of DNA transfer with the routinely used
BioRad Gene Pulser, viz., identical for the BTX, and 1.25 kV, 90 ohms and 40 µFD for the
Cellject. In the case of the Jouan Electropulser no transformants were obtained under any of
the conditions employed. Indeed, the machine appeared incapable of effecting DNA transfer
even into E. coli. This failure would appear to have been largely due to the ineffective
electroporation chamber supplied with the apparatus, which was cumbersome to use and
suffered from sample leakage. The other two machines both proved effective in eliciting
transformation of C. acetobutylicum NCIB 8052 (Table 4). However, under optimum
conditions, use of the BioRad machine consistently resulted in the highest transformation
frequencies. The subjection of cell suspensions to a second pulse, of various magnitudes,
using the Cellject gave a slight increase (c. 10%-20%) in the number of transformants, but the
frequency obtained was significantly lower than those achieved with the BioRad apparatus. It
was concluded that the electroporators of other manufacturers offered no advantage over the
BioRad Gene Pulser, and this apparatus was used in all subsequent experiments with C.
sporogenes.


Attempted  electrotransformation  of  strains  of  C.  sporogenes. 

Prior to attempting the transformation of any particular strain of C. sporogenes, it was of
interest to estimate the effect of electrical pulses on cell viability. Cell suspensions, prepared
as for C. acetobutylicum, were therefore divided in two, and one fraction subjected to pulses of
various magnitudes before serial dilutions of both cell fractions were plated onto 2x YTG
agar. From the viable colony count obtained with the two cell fractions it was possible to
estimate the % cell survival after each pulse. Some representative data are shown in Table 5.


STRAIN        100 ohms    200 ohms    400 ohms

BM1781        46          19          16
BM2131        40          22          14
BM1706        45          20          21
BM1759        38          18          15
BM1091        37          19          13
NCIB 8052     8.5         3.5         1.05

Table 5. Percentage survival of C. sporogenes cells, compared to C. acetobutylicum NCIB 8052

The results obtained indicated that all the C. sporogenes strains under investigation exhibited
similar levels of fragility with respect to the pulse applied. It was further apparent that C.
sporogenes is generally a more robust organism than C. acetobutylicum.
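The % survival estimate described above is the ratio of viable counts in the pulsed and
unpulsed fractions. A minimal sketch of that arithmetic follows; the colony numbers, dilution
factor and plated volume in the example are hypothetical and are not taken from this report.

```python
def viable_count_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Viable cells per ml of the undiluted suspension.

    dilution_factor is the fold dilution of the plated sample (e.g. 1e4 for 10^-4).
    """
    return colonies * dilution_factor / plated_volume_ml


def percent_survival(pulsed_count: float, unpulsed_count: float) -> float:
    """Survival of the pulsed fraction relative to the unpulsed control."""
    if unpulsed_count <= 0:
        raise ValueError("unpulsed control count must be positive")
    return 100.0 * pulsed_count / unpulsed_count


if __name__ == "__main__":
    # Hypothetical plate counts: both fractions plated as 0.1 ml of a 10^-4 dilution.
    control = viable_count_per_ml(200, 1e4, 0.1)
    pulsed = viable_count_per_ml(92, 1e4, 0.1)
    print(round(percent_survival(pulsed, control), 1))
```

Because both fractions derive from the same suspension, any error in the absolute count cancels
when the same dilution and plated volume are used for each.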

These experiments established that the field strength applied was having some effect on cell
viability. However, as there are no hard and fast rules as to the level of cell survival most
appropriate for successful transformation, attempts to transform the 20 strains of
C. sporogenes were undertaken using a pulse of constant voltage (1.25 kV) and capacitance
(25 µFD), but at the three different resistances employed in the cell viability experiments, viz.,
100, 200 and 400 ohms. The plasmids employed were pCB3 (Young et al., 1989), pMTL520
(Minton et al., 1990a) and pMTL500ET (Fig. 11). Plasmid pMTL500ET is based on the
replicon of the Enterococcus faecalis plasmid pAMβ1, widely recognised as possessing an
extremely broad host range amongst Gram-positive bacteria. Plasmids pMTL520 and pCB3
utilise the replicon of the Clostridium butyricum plasmid pCB101 (Minton and Morris, 1981).
The selective marker of pCB3 is the pAMβ1 erm gene (Em^R), that of pMTL520 the
Clostridium perfringens tetP gene (Tc^R), while pMTL500ET specifies both resistance genes.

Fig. 11. The Clostridium/E. coli shuttle vector pMTL500ET. Constructed by isolating a 2.9 kb
SstI-PstI fragment encoding tetP from the C. perfringens plasmid pJIR71 (Rood and Cole,
1991) and inserting it between the equivalent sites of pMTL500E (Oultram et al., 1988a).

In the vast majority of cases, no C. sporogenes colonies resistant to either Tc or Em were
obtained. A number of putative transformants did result from experiments involving BM1769,
BM1776, BM1783 and BM1706 and the plasmid pMTL500ET, and BM1783 and the plasmid pCB3.
Subsequent small scale isolation procedures undertaken on representative colonies failed to
reveal the presence of extrachromosomal DNA in the resultant cleared lysates. Furthermore,
the lysates from the putative pMTL500ET transformants were incapable of transforming
competent E. coli to Ap^R. In contrast, 8 Ap^R transformants were obtained using the BM1783
lysate derived from the putative pCB3 transformant. Although all 8 E. coli transformants were
shown to contain plasmid DNA, only 1 gave a restriction pattern characteristic of pCB3. In
further tests, radiolabelled pCB3 DNA was used in a Southern blot experiment against total DNA
isolated from the putative BM1783 transformant. No positive signal was detected.

Attempts to obtain further transformants of any of the 5 C. sporogenes strains proved
unsuccessful. This included experiments in which the strains were grown in media containing
2% glycine, prior to the preparation of "competent" cell suspensions. Electrotransformation
as a means of eliciting DNA transfer was therefore abandoned in favour of conjugative
procedures.


Conjugative  DNA  transfer 

The ability of IncP plasmids to effect the mobilisation of co-resident cloning vectors from
an E. coli donor to a variety of Gram-positive recipients is now well documented (Trieu-Cuot et
al., 1987). Indeed, previous studies have shown that when pMTL500E or pCB3 is endowed
with the transfer origin of the IncPII plasmid RK2 (oriT), then conjugative transfer of the
resultant plasmids (pCTC1 and pCTC501, respectively) was demonstrable between a Tra+
(RK2) E. coli donor and C. acetobutylicum NCIB 8052 (Williams et al., 1990a & b). To test
the applicability of this method to C. sporogenes all 20 strains were used as recipients in filter
matings using the Tra+ E. coli host SM17 containing either pCTC1 or pCTC501. Strains were
examined in batches of 5, and every experiment included a filter mating employing C.
acetobutylicum NCIB 8052 as the control. In no instance were any Em^R colonies recovered
from a mating involving a C. sporogenes strain as the recipient. In contrast, in every batch of
matings, the C. acetobutylicum control experiment consistently gave Em^R transconjugants.


2.3 AN EXPRESSION SYSTEM FOR CLOSTRIDIUM ACETOBUTYLICUM


2.3.1  Summary 

The inability to effect the transfer of plasmid DNA to any strain of C. sporogenes has led
to the adoption of C. acetobutylicum NCIB 8052 as the proposed host for production of BoNT
toxoid. Efforts have focused on deriving a regulated expression system based on the
previously constructed fac promoter, composed of the transcriptional initiation signals of the
C. pasteurianum ferredoxin gene in which a synthetic lac operator sequence has been inserted
immediately 3' to the +1 nucleotide. Using an E. coli model system, transcription from fac
has been shown to be subject to the regulatory control of the E. coli lacI gene product.
Subsequent efforts have concentrated on attempting to obtain lacI expression in C.
acetobutylicum NCIB 8052. Insertion of a lacI gene derivative, which had been
transcriptionally coupled to a Gram-positive vegetative promoter, into the fac-based expression
vector pMTL500F proved impossible, due to structural instability. Attempts to construct a
second plasmid compatible with pMTL500F, into which lacI may be inserted, have been
hampered by the lack of an alternative selectable marker to that (erm) carried by pMTL500F.
Although tetP appeared to express in C. acetobutylicum, transformants could not be directly
selected on the basis of Tc^R. Selection for acquisition of a cat gene on the basis of
thiamphenicol resistance also proved inappropriate with this clostridial strain. The cloning of
restriction fragments encoding staphylococcal determinants of Ap resistance resulted in
structural rearrangements/deletions in the resultant recombinant plasmids. Progress was made
in devising a means of targeting DNA to the host chromosome. Evidence was obtained
suggesting that a replication-impaired plasmid (pMTL513E) carrying an internal portion of the
C. acetobutylicum gutD gene readily becomes integrated into the host chromosome. This
observation opens up a route to obtaining a C. acetobutylicum derivative in which the lacI
gene is integrated into the chromosome, preferably by recombinational events involving double
cross-overs. In the absence of antibiotic resistance markers, selection for such an event may
be achieved by cointegration of the C. pasteurianum leuB gene in an NCIB 8052 LeuB-
derivative, SBA9.


2.3.2 Results and Discussion


Transcription from the clostridial fac promoter is regulated by LacI

The failure to achieve demonstrable DNA transfer into any of the C. sporogenes strains
tested necessitated the use of C. acetobutylicum as an alternative host. This clostridial species
has a number of advantages over C. sporogenes. On a practical level, we have already
developed the necessary means of manipulating this species. Equally important, this species
has no known association with human disease and should therefore command a lower Access
factor in any proposed recombinant experiments. The proposed expression of BoNT gene
subfragments can therefore be undertaken at a lower category of containment. Furthermore,
parallel studies undertaken in this laboratory have resulted in the construction of an expression
cartridge, similar to that proposed for the C. sporogenes rrn promoter, based on the
transcriptional signals of the ferredoxin (Fd) gene of Clostridium pasteurianum (Minton et al.,
1990a). This expression cartridge was shown to direct the expression of the pC194 cat gene
such that the encoded protein represented between 3 and 7% of the cells' soluble protein
(Minton et al., 1990b). In more recent studies this promoter has been modified by the precise
insertion of an E. coli lac operator sequence at the Fd +1, and the resultant promoter
derivative (designated fac) inserted into pMTL500E in place of the natural promoter of the
lacZ' gene. Thus, in the derived plasmid, pMTL500F (Fig. 12), expression of lacZ' is under
the transcriptional control of fac (Minton et al., 1990b).


[Figure 12 diagram: Fd promoter, lac operator, Fd RBS and the ATG of the NdeI site, preceding lacZ'.]

Fig. 12. The Clostridium acetobutylicum expression vector, pMTL500F. Plasmid pMTL500F
was constructed by replacing the lac po region of pMTL500E with the indicated modified (see
Minton et al., 1990a) Fd promoter. During its derivation, plasmid pMTL500F also acquired the
pSC101 stability function, par (PAR). The ATG tri-nucleotide of the indicated NdeI restriction
recognition site corresponds to the AUG translational start codon of lacZ', and is immediately
preceded by the Fd ribosome binding site (RBS). The multiple cloning sites (MCS) are those of
pMTL20 (Chambers et al., 1988).


Fig. 13. Inducible expression of the pC194 cat gene cloned in pMTL500E and pMTL500F. A
promoterless copy of the pC194 cat gene, excised from pMTL20C (Swinfield et al., 1990) as a
0.8 kb MnlI fragment, was inserted into the SmaI site of pMTL500E and pMTL500F, such that
transcription was dependent on the lac or Fd promoter, respectively. The two recombinant
plasmids were independently introduced into E. coli TG1 containing the lacI^q-encoding plasmid
pNM52 (Gilbert et al., 1986), and the two clones grown in 2XYT broth to an optical density of 0.6. At
this point expression was induced by addition of IPTG (indicated on the figure) to a final concentration
of 1 mM. CAT activity of cells carrying pMTL500E (●) or pMTL500F (■) is expressed as %
cell soluble protein. The culture density of cells carrying pMTL500E and pMTL500F is
indicated by (○) and (□), respectively.


The presence of the lac operator should enable transcription from fac to be blocked by
binding of the LacI protein. Derepression may subsequently be achieved by the addition of the
inducer IPTG. Such inducibility requires that the lacI gene is efficiently expressed in the
recombinant host employed. In our preliminary studies the pC194 cat gene was inserted into
pMTL500F and the resultant plasmid introduced into an E. coli host which carried the lacI gene
on a co-resident, compatible plasmid, pNM52 (Gilbert et al., 1986). When cells carrying both
plasmids were grown overnight in the presence or absence of IPTG, significant repression of
cat expression was evident. Thus non-induced cells synthesised CAT to levels of approx.
1.0% of the cells' soluble protein, compared to the 13% levels attained in cells supplemented
with IPTG.
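
The degree of repression in the overnight cultures can be expressed as a fold-induction figure. The following is a minimal sketch of that arithmetic, using only the two values quoted in the text (approx. 1.0% and 13% of soluble protein). The function name is illustrative and is not part of the original work.

```python
def fold_induction(uninduced_pct: float, induced_pct: float) -> float:
    """Ratio of induced to uninduced CAT level (both as % of soluble protein)."""
    if uninduced_pct <= 0:
        raise ValueError("uninduced level must be positive to form a ratio")
    return induced_pct / uninduced_pct


# Values quoted in the text for overnight cultures carrying pNM52 and the cat construct.
uninduced = 1.0   # % of soluble protein without IPTG (approximate)
induced = 13.0    # % of soluble protein with IPTG

ratio = fold_induction(uninduced, induced)
# Fraction of the fully induced level that is suppressed in the absence of IPTG.
repressed_fraction = 1.0 - uninduced / induced
print(f"fold induction: {ratio:.0f}x, repression: {repressed_fraction:.0%}")
```

On these approximate figures the induced level is about 13 times the uninduced level, i.e. LacI suppresses a little over nine-tenths of the expression seen with IPTG.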

A clearer idea of the degree of repression/inducibility was obtained by undertaking the
experiment outlined in Fig. 13. Cells carrying pNM52 and either pMTL500Ecat or
pMTL500Fcat were grown in rich media to mid exponential phase, when IPTG was added to
both cultures at a final concentration of 1.0 mM. It can be seen that prior to induction no
CAT activity was detectable. Following the addition of IPTG, rapid induction of cat
expression was evident. Most encouragingly, the degree of repression/induction exhibited by
the natural lac promoter (pMTL500Ecat) and the fac promoter (pMTL500Fcat) was identical.


Towards expression of lacI in C. acetobutylicum

Having established that fac can be regulated by the LacI repressor protein, efforts focused on
effecting expression of the lacI gene in C. acetobutylicum NCIB 8052. Previous workers have
elicited expression of lacI in the Gram-positive bacterium B. subtilis by coupling transcription
to a Bacillus vegetative promoter and inserting the modified gene either into the backbone of
the expression vector itself (pREP8), or into a second compatible plasmid (LeGrice et al.,
1987). Therefore, we initially attempted to insert a pREP8-derived lacI-encoding DNA
fragment into the specially constructed unique EcoRV site of the expression vector
pMTL500F. Accordingly, a 1.4 kb EcoRI-PvuI fragment carrying lacI was excised from
pREP8, blunt-ended by treatment with T4 DNA polymerase and ligated to EcoRV-cleaved
pMTL500F. Subsequent analysis of the recombinant plasmids obtained, however, indicated
that severe structural rearrangements had occurred.

The alternative strategy of inserting this gene into a second co-resident plasmid requires the
availability of a plasmid which is not only compatible with regard to replication apparatus (i.e.,
a different replicon) but which, to prevent intermolecular recombination, also shares no
DNA homology with the expression vector. We have previously constructed (Minton et al., 1988) such a
prototype vector (pMTL520) which, with reference to pMTL500F, fulfils all these criteria.
Thus whereas pMTL500F is based on the E. coli ColE1 replicon, pMTL520 utilises the p15A
replicon. Similarly, pMTL500F uses the pAMβ1 replicon and erm gene, whereas pMTL520
makes use of the pCB101 replicon and tetP from a C. perfringens R-factor. However,
repeated attempts to transform C. acetobutylicum to Tc^R (10 µg/ml) were unsuccessful, raising
doubts as to the suitability of pMTL520 for use in C. acetobutylicum.




The  tetP  gene  cannot  be  used  as  a  selective  marker  in  C.  acetobutylicum 


Two explanations may be invoked for the inability of pMTL520 to transform C.
acetobutylicum. Either: (i) although pMTL520 confers resistance to Tc on E. coli hosts, the
tetP gene does not function in C. acetobutylicum, or; (ii) the pCB101 replicon became
inactivated during the construction of the vector. To clarify the situation a second plasmid was
constructed by inserting the tetP gene into pMTL500E (Fig. 11). This new plasmid,
pMTL500ET, therefore encodes both erm and tetP. Confirmation that both antibiotic
resistance genes function in a Gram-positive host was obtained by transforming B. subtilis,
where it proved possible to select for transformants on the basis of either Em^R or Tc^R.
Transformation of C. acetobutylicum was then repeated using pMTL500ET DNA with
selection on plates containing either Em (10 µg/ml) or Tc (10 µg/ml). Transformants were
only obtained on the former plates. Furthermore, these Em^R transformants subsequently failed
to grow on agar medium containing 10 µg/ml Tc.


TETRACYCLINE        GROWTH of NCIB 8052
CONCENTRATION       Plasmid-free      + pMTL500ET

0                   +++               +++
1 µg/ml             -                 +++
2.5 µg/ml           -                 ++
5 µg/ml             -                 +
10 µg/ml            -                 -

Table 6. Growth of NCIB 8052 and a pMTL500ET transformant on media supplemented with Tc


The inability of pMTL500ET Em^R transformants to grow on plates containing Tc prompted
an examination of the level of susceptibility of C. acetobutylicum to this antibiotic over a range
of concentrations. The results are shown in Table 6. Plasmid-free C. acetobutylicum was found to be
incapable of growth at levels as low as 1 µg/ml. In contrast, a pMTL500ET transformant
(selected on the basis of resistance to Em) was capable of normal growth at this level of
antibiotic, reduced growth at a Tc concentration of 2.5 µg/ml and sparse growth on agar
containing 5 µg/ml Tc. The transformation experiment with pMTL500ET was therefore
repeated, with selection on plates containing 1.0 and 2.5 µg/ml of Tc. Although Tc^R colonies
were obtained at both concentrations, an almost equivalent number were obtained using cells
which received no plasmid DNA. Furthermore, replica plating of the putative transformants
onto agar media supplemented with Em revealed that no colony had become Em^R. Raising the
level of Tc in agar plates to 4 µg/ml appeared to prevent any growth of cells which did not
receive plasmid DNA. However, the low number of colonies obtained from cells treated with
pMTL500ET DNA were all found to be Em^S, indicating they were not true transformants.
Indeed, no extrachromosomal DNA was evident when appropriate cleared lysates were
analysed by agarose gel electrophoresis. It was concluded that, although tetP appears to confer
Tc^R on C. acetobutylicum once the plasmid carrying it has become established in the cell, it is
not possible to directly select for Tc^R in transformation experiments.


Alternative  selective  markers 

Because the availability of only one selective marker (erm) places severe limitations on any
future recombinant manipulations in C. acetobutylicum, the identification of a second marker is a
matter of some importance. Reliance on commonly used genes specifying resistance to Cm
and Km has previously proven inappropriate for this Clostridium sp. (see Oultram et al.,
1987). Some authors have circumvented the problems associated with the anaerobic reduction
of chloramphenicol by using thiamphenicol, e.g., in Clostridium thermohydrosulfuricum
(Soutschek-Bauer et al., 1985). The possibility of using this analogue for selection of
pMTL500Fcat transformants was therefore explored. As growth experiments demonstrated
that C. acetobutylicum NCIB 8052 grew on levels of thiamphenicol up to and including 100
µg/ml, a concentration of 150 µg/ml was used in selective plates. However, although
pMTL500Fcat electrotransformants could be readily selected on the basis of Em^R, no colonies
were obtained on the plates containing thiamphenicol.

A further potential marker gene that could be employed is a gene specifying resistance to
ampicillin. Such a determinant, encoding a typical "pBR322-like" β-lactamase from a
Staphylococcus aureus plasmid (pS1), has recently been sequenced (East and Dyke, 1989).
However, although the sequenced bla gene alone is sufficient for Ap^R in E. coli, a region of
DNA 5' to the gene is required for resistance in staphylococci. A plasmid carrying the whole
determinant necessary for Ap^R in a Gram-positive bacterium, pAE306, was obtained from Dr
K Dyke at Oxford and a 4.0 kb fragment excised and inserted into pMTL520. Although the
resultant plasmid conferred Ap^R on an E. coli host, Ap^R transformants of C. acetobutylicum
were not obtained. Subsequent dialogue with Dr Dyke's laboratory indicated that
rearrangements of the Ap^R determinant of plasmid pAE306 had occurred. A second plasmid
was therefore obtained, pSU104, which carried the entire Tn552 transposon, encompassing
blaZ, on a 6.0 kb BamHI fragment. However, attempts to insert this fragment into the
polylinker site of pMTL500E consistently resulted in recombinant plasmids in which structural
rearrangements/deletions were apparent.


The gut operon as a potential site for homologous integration

The facility for effecting the insertion of heterologous DNA into the host chromosome, by
homologous recombination, offers considerable advantages in any proposed programme of
strain manipulation. The principal attraction is that it circumvents the problems of
recombinant segregational instability commonly associated with autonomous vectors. Thus,
the ability to integrate genes into the C. acetobutylicum chromosome offers great potential in
the future generation of strains expressing bot gene subfragments. Such technology, however,
also provides the facility for generating a strain in which lacI has been inserted into the
chromosome.

Integrative strategies require two components: (i) a cloned region of the host chromosome,
to provide homology for recombination, the disruption of which is not deleterious to cell
growth, and; (ii) a vector delivery system, the replication properties of which favour
integration. We have just completed sequencing the gut operon of C. acetobutylicum, which
encodes the genes necessary for glucitol (sorbitol) transport/metabolism. The operon (Fig. 14)
has the same overall arrangement as that of E. coli (Yamada and Saier, 1987), but additionally
contains a gene coding for a protein exhibiting homology to the ORF U polypeptide of
B. subtilis (Trach et al., 1988). Recently the ORF U polypeptide has been shown to exhibit distant
homology to yeast transaldolase (J Cary, personal communication), providing tentative
evidence that the C. acetobutylicum ORF X gene product may be transaldolase (Fig. 14). The
gutD gene (encoding glucitol dehydrogenase) seems an ideal target for integration as it is not
normally required by the host, and presents an easy test for successful integration, i.e., inability
to grow on sorbitol as the carbon source.


[Figure 14 diagram: gene maps of the gut region. Clostridium acetobutylicum: gutA1, gutA2, orfX, gutB, gutD; with the equivalent regions of Escherichia coli and Bacillus subtilis.]

Fig. 14. The arrangement of genes in the C. acetobutylicum gut operon, and the position of
equivalent genes in E. coli and B. subtilis. The encoded polypeptides of similarly shaded ORFs
exhibit amino acid homology. The encoded enzymes are: gutA, PTS-II^gut; gutB, Enzyme III^gut;
gutD, glucitol-6-P dehydrogenase, and; orfU & orfX (C. acetobutylicum), transaldolase. A
sequence error in the illustrated B. subtilis region means that orfY and tsr form only 1 ORF,
which encodes aldolase (J Cary, personal communication).


Integrative  vectors 

Integrative vectors are ideally based on plasmids which are temperature sensitive for
replication. Such a vector, containing a cloned region of the host genome, may be introduced
into the target cell and selected at a temperature permissive for replication. Successfully
transformed cells may then be grown at the non-permissive temperature in the presence of the
antibiotic to which the vector confers resistance. Under these conditions plasmids are
rapidly lost from the population, with the result that the only cells which can grow in the
presence of the antibiotic are those in which chromosomal integration of the plasmid element
has occurred. With this in mind, attempts have been made to isolate a temperature-sensitive
replication mutant of pMTL500E (Fig. 15) by in vitro mutagenesis. Plasmid DNA was
incubated with hydroxylamine, as previously described (Minton, 1984), and the resultant
damaged DNA used to transform E. coli cells to Ap^R. Total transformant colonies were then
pooled (by flooding the agar plates with media), bulk plasmid DNA prepared and used to
transform B. subtilis to Em^R at 28°C. Colonies obtained were then replica plated onto fresh
plates and grown for 24 h at 42°C. To date approx. 5,000 B. subtilis colonies have been
screened in this manner, but only one putative ts mutant has been isolated. Subsequent
characterisation of this transformant, however, indicated that the ts defect resided in the adenine
methylase enzyme (erm gene). Screening is continuing.


Fig. 15. Cloning vectors based on the pAMβ1 replicon. All plasmids were generated by
insertion of the indicated pAMβ1-derived DNA fragment (bold line; see Swinfield et al., 1990)
into the NheI site of pMTL20E (thin line). The lacZ' is therefore functional (blue colonies in
the presence of XGal) unless inactivated by subsequent insertion of heterologous DNA into the
polylinker region. Plasmid pMTL500E is a high copy number plasmid, while pMTL502E and
pMTL513E have a low copy number. The general purpose cloning vectors pMTL500E and
pMTL502E exhibit moderate segregational stability. Plasmid pMTL513E exhibits extreme
instability in both B. subtilis and C. acetobutylicum (see Table 7).

In parallel to the above, integrative experiments have proceeded with the replication-impaired
vector pMTL513E (Fig. 15). This vector was derived by replacing the pAMβ1
replication region of pMTL500E with the pAMβ1 replicon of plasmid pMTL20CB13
(Swinfield et al., 1990). Because this replicon contains a deletion which extends into the
replication origin, the efficiency of replication is severely impaired. Thus, in the presence of
the selective antibiotic B. subtilis cells carrying this plasmid exhibit a 4-fold increase in
doubling time, while in the absence of selective pressure plasmid-free segregants arise at an
extremely high frequency (Swinfield et al., 1990). A 336 bp NheI-SpeI restriction fragment,
internally located within the gutD structural gene, was therefore cloned into the polylinker of
pMTL513E at its unique XbaI site. The plasmid obtained, pJEN2, was transformed into C.
acetobutylicum NCIB 8052 and Em^R transformants selected. Interestingly, pJEN2 transformed
C. acetobutylicum at a significantly higher frequency (5-fold higher) than the progenitor
vector, pMTL513E, presumably as a result of carrying a homologous chromosomal DNA
insert. A transformant containing the plasmid was then grown for 50 generations in the
absence of antibiotic selection, before Em was added to the medium, the culture incubated for
a further 8 hours and cells plated out on agar medium containing Em. As can be seen from
Table 7, pJEN2 was rapidly lost from the cell population in the absence of selective pressure.
Indeed, when the culture was plated out after 50 generations only 15 Em^R colonies were
obtained. Using appropriate minimal agar plates, the cells from all 15 colonies were
subsequently shown to be incapable of growth on sorbitol as the sole carbon source.
Furthermore, preliminary experiments indicated that loss of Em^R no longer occurs when the
cells are grown in the absence of antibiotic (Table 7). Both observations strongly indicate that
integration of pJEN2 has occurred at the gutD gene.


PLASMID           % OF CELLS RESISTANT TO ERYTHROMYCIN
                  10 generations      20 generations

pMTL531E          99.5                99
pMTL500E          66                  44
pJEN2             0.4                 0.01
CHR::pJEN2        100                 100

Table 7. Segregational instability of pMTL513E during growth of C. acetobutylicum in the absence of
antibiotic selection. Cells were grown in 2 x YTG for 10 and 20 generations and the % of cells no
longer Em^R estimated by deriving colony viable counts on media with and without Em. For
comparative purposes, the results with pMTL500E and a stabilised derivative, pMTL531E (Swinfield et
al., 1991), are shown. Preliminary results indicate the putative integrant (CHR::pJEN2) is 100% stable.


[Figure 16 diagram: map of the integrated region, labelled Em^R, gutB, gutD' and 'gutD.]

Fig. 16. Schematic representation of Campbell-like integration of pMTL513E containing a gutD
subfragment into the C. acetobutylicum chromosome.
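
The figures in Table 7 can be converted into an approximate per-generation loss rate. The sketch below assumes a simple model that the report itself does not state: a constant probability p of plasmid loss per generation and no growth advantage for plasmid-free cells, so that the resistant fraction after n generations is (1 - p)^n. The function name is hypothetical.

```python
def loss_per_generation(resistant_fraction: float, generations: int) -> float:
    """Solve F = (1 - p)**n for p, the assumed constant loss probability per generation."""
    if not 0.0 < resistant_fraction <= 1.0:
        raise ValueError("resistant_fraction must be in (0, 1]")
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 1.0 - resistant_fraction ** (1.0 / generations)


# % of cells still Em-resistant after 10 and 20 generations (values from Table 7).
table7 = {
    "pMTL531E": (99.5, 99.0),
    "pMTL500E": (66.0, 44.0),
    "pJEN2": (0.4, 0.01),
    "CHR::pJEN2": (100.0, 100.0),
}

for plasmid, (pct10, pct20) in table7.items():
    p10 = loss_per_generation(pct10 / 100.0, 10)
    p20 = loss_per_generation(pct20 / 100.0, 20)
    print(f"{plasmid:12s} loss/generation: {p10:.3f} (10 gen), {p20:.3f} (20 gen)")
```

Under this assumption pMTL500E loses the marker from roughly 4% of cells per generation, whereas pJEN2 is lost from around 40% of cells per generation. Any growth advantage of plasmid-free cells would inflate these estimates, so they are best read as upper bounds on the true segregation rate.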

A schematic representation of how pJEN2 could become inserted at the gutD locus of the
C. acetobutylicum chromosome by Campbell integration is shown in Fig. 16. In this scheme a
single recombinational cross-over results in duplication of the homologous gutD gene segment,
concomitant with inactivation of the chromosomal copy. Double cross-overs would not
inactivate the gene, nor result in a strain exhibiting segregational stabilisation of the Em^R
determinant. The experiments necessary to confirm that such an event has occurred are
currently being undertaken. In addition to classical DNA/DNA hybridisation experiments,
oligonucleotides have been synthesised specific to sequences in pMTL513E and gutB. The
PCR amplification of a DNA fragment of the correct size (see Fig. 16) will confirm integration.
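
The single cross-over described above can be illustrated with a toy model in which the chromosome and plasmid are lists of named segments. This is only a sketch: the segment names (gutD_5prime, gutD_internal, gutD_3prime, vector_erm) and both functions are hypothetical labels chosen for illustration, and no sequence data are involved.

```python
def campbell_integrate(chromosome, plasmid, homology):
    """Insert a circular plasmid into a linear chromosome by one cross-over
    within the shared `homology` segment (Campbell-type integration)."""
    i = chromosome.index(homology)
    j = plasmid.index(homology)
    # Open the circle at the homology segment: the rest of the plasmid, in order.
    backbone = plasmid[j + 1:] + plasmid[:j]
    # The homology segment ends up on both sides of the inserted backbone.
    return chromosome[:i + 1] + backbone + [homology] + chromosome[i + 1:]


def has_intact_gene(segments, gene_parts):
    """True if the gene's parts occur contiguously and in order."""
    n = len(gene_parts)
    return any(segments[k:k + n] == gene_parts for k in range(len(segments) - n + 1))


gutD = ["gutD_5prime", "gutD_internal", "gutD_3prime"]
chromosome = ["gutB"] + gutD
plasmid = ["gutD_internal", "vector_erm"]  # pJEN2-like: internal fragment plus vector

integrant = campbell_integrate(chromosome, plasmid, "gutD_internal")
print(integrant)
print("intact gutD before:", has_intact_gene(chromosome, gutD))
print("intact gutD after: ", has_intact_gene(integrant, gutD))
```

In the model the internal fragment is duplicated and the vector lies between a 3'-truncated and a 5'-truncated copy of gutD, so no intact copy remains. This corresponds to the gutD' and 'gutD arrangement of Fig. 16 and to the sorbitol-negative phenotype of the putative integrants.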


CONCLUSIONS


1] CLONING OF Clostridium botulinum NEUROTOXIN GENES

Status: The entire nucleotide sequences of the botB and botE genes have been
determined, and the majority of that of botF. The recombinant clone necessary to
elucidate the missing 60 codons from the 3' end of botF is available, but a region
encompassing the missing 20 codons from the 5' end of the gene has yet to be
obtained. Additionally, only 50% of the currently available botF sequence has
been determined from more than one PCR amplified fragment. Initial attempts
to PCR amplify regions of botG have proven unsuccessful.

Future aims: The immediate objective of this aspect of the proposal is to complete the botF
cloning/nucleotide sequence determination. The missing 5' end will be obtained
by cloning an appropriate PCR amplified fragment. Oligonucleotides will be
based on known botF sequence and a sequence from the 5' non-coding region of
botE. Authentication of the sequence will be performed by sequencing
independently isolated duplicate clones to those already isolated, and where
necessary, additional clones. Once completed, efforts will focus on botG, for which
PCR primers will be synthesised based on nucleotide sequences capable of
encoding amino acid motifs which are highly conserved between all 6
characterised neurotoxins.

2]  EXPRESSION  SYSTEM  DEVELOPMENT 

Status: Attempts to elicit the transfer of plasmid DNA vectors into 20 different strains
of Clostridium sporogenes, by either electrotransformation or conjugative
mobilisation, have met with no success. Rather than persevere with this
Clostridium sp., the genetically amenable species Clostridium acetobutylicum
NCIB 8052 has been chosen as an alternative host for the future expression of bot
gene subfragments. Efforts have focused on imposing regulatory control on the
fac promoter system by seeking to obtain expression of lacI in C.
acetobutylicum. Preliminary attempts have been unsuccessful, largely due to the
lack of a second selectable genetic marker. Circumvention of this problem
should prove possible following the demonstration that it is possible to "force"
the integration of recombinant DNA into the host genome.

Future aims: In the immediate future we will continue to address the problem of a second
selectable marker. Alternative cat and tet genes have been requested from other
groups working in the field, most significantly pAMβ1-based vectors carrying
the streptococcal tetM gene, which has been shown to be selectable in C.
acetobutylicum DSM 1731 (P. Dürre, personal communication). The highest
priority, however, will be given to integrative studies. The type of integration
achieved to date (involving a single cross-over) is not ideal, as the duplication of
homologous DNA will result in excision of the inserted recombinant vector at an
equivalent frequency to insertion. Stable integrants require double cross-overs,
in which reciprocal exchange of DNA occurs between the plasmid and
chromosomal copies. The constructs necessary to achieve this are currently
being made (i.e., the entire gutD gene in which a central portion of the
structural gene is deleted and replaced with a heterologous gene). Using this
strategy, we will endeavour to integrate a DNA fragment composed of lacI and
the C. pasteurianum leuB gene, arranged in tandem, at the gutD locus of a leuB-
mutant of C. acetobutylicum, SBA9. The C. pasteurianum leuB gene has
previously been shown to convert this C. acetobutylicum auxotroph to
prototrophy (Oultram et al., 1988a; 1988b). Thus, SBA9 cells transformed with
pMTL513E carrying DNA encompassing gutD::lacI/leuB will initially be
selected on the basis of Em^R. Thereafter, they will be grown in the absence of
antibiotic selection for 50 generations, and then plated on minimal media lacking
leucine. Leu+, Em^S colonies which can no longer grow on sorbitol as a carbon
source should represent clones in which integration of lacI/leuB has occurred
by a double cross-over.
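
The proposed screen amounts to a three-way phenotype test on each colony. A minimal sketch of that decision logic follows; the class, field names and result strings are hypothetical labels introduced here for illustration, and the interpretation of each phenotype simply follows the reasoning in the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Colony:
    leu_prototroph: bool      # grows on minimal medium lacking leucine
    em_resistant: bool        # grows in the presence of erythromycin
    grows_on_sorbitol: bool   # uses sorbitol as the carbon source


def classify(colony: Colony) -> str:
    """Interpret a colony phenotype under the proposed selection scheme."""
    if not colony.leu_prototroph:
        return "no leuB acquired"
    if colony.em_resistant:
        return "vector still present (plasmid or single cross-over)"
    if colony.grows_on_sorbitol:
        return "Leu+ but gutD intact: not the intended event"
    return "candidate double cross-over at gutD"


# Leu+, Em-sensitive, sorbitol-negative: the phenotype sought in the screen.
print(classify(Colony(leu_prototroph=True, em_resistant=False, grows_on_sorbitol=False)))
```

Only the Leu+, Em^S, sorbitol-negative class is carried forward as a candidate. Physical confirmation, for example by hybridisation or PCR as described for pJEN2, would still be required.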

Once a C. acetobutylicum lacI+ host has been obtained, it should prove possible
to regulate the expression of heterologous genes which have been
transcriptionally coupled to the fac promoter of pMTL500F. At this stage we
will begin inserting various regions of the botA gene into pMTL500F.
Constructs will be made in E. coli, where any deleterious effects which may be
associated with expression in this organism may be avoided by repressing
transcription from fac by using a lacI+ host (i.e., containing pNM52, Gilbert et
al., 1986). Once obtained, plasmids will be transformed into SBA9::lacI, where
expression may be elicited by the addition of IPTG.

In  general  terms  the  objectives  with  regard  to  expression  system  development  remain  on 
schedule.  Thus,  although  there  has  been  a  change  of  host  organism,  the  current  status  exactly 
matches  that  alluded  to  in  the  grant  application  SOWs.  On  the  bot  gene  cloning  programme, 
progress  is  significantly  ahead  of  schedule,  a  situation  largely  due  to  the  participation  of 
additional  manpower  resource. 